WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau




INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)






(51) International Patent Classification 5:
C12N 15/54, 9/12, 1/21
// (C12N 1/21, C12R 1/21)
A1

(11) International Publication Number: WO 92/06202
(43) International Publication Date: 16 April 1992 (16.04.92)



(21) International Application Number: PCT/US91/07076 

(22) International Filing Date: 26 September 1991 (26.09.91) 



(30) Priority data:
590,490    28 September 1990 (28.09.90)    US



(60) Parent Application or Grant 
(63) Related by Continuation 

US 590,490 (CIP) 

Filed on 28 September 1990 (28.09.90) 

(71) Applicant (for all designated States except US): CETUS 
CORPORATION [US/US]; 1400 Fifty-Third Street, 
Emeryville, CA 94608 (US). 



(72) Inventors; and 

(75) Inventors/Applicants (for US only): GELFAND, David, H.
[US/US]; 6208 Chelton Drive, Oakland, CA 94611
(US). LAWYER, Frances, C. [US/US]; 6641 Saroni
Drive, Oakland, CA 94611 (US). ABRAMSON, Richard,
D. [US/US]; 5901 Broadway, #30, Oakland, CA
94618 (US). GREENFIELD, I., Lawrence [US/US]; 36
Wildwood Court, Pleasant Hill, CA 94523 (US). REICHERT,
Fred, L. [US/US]; 2845 Carmel Street, Oakland,
CA 94602 (US).

(74) Agent: SIAS, Stacey, R.; Cetus Corporation, 1400 Fifty-
Third Street, Emeryville, CA 94608 (US).

(81) Designated States: AT (European patent), AU, BE (European
patent), CA, CH (European patent), DE (European patent),
DK (European patent), ES (European patent), FR (European
patent), GB (European patent), GR (European patent), IT
(European patent), JP, LU (European patent), NL (European
patent), SE (European patent), US.
Published 

With international search report 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments. 



(54) Title: PURIFIED THERMOSTABLE NUCLEIC ACID POLYMERASE ENZYME FROM THERMOSIPHO AFRICANUS



(57) Abstract 

A purified thermostable enzyme is derived from the eubacterium Thermosipho africanus. The enzyme has DNA polymerase
activity, reverse transcriptase activity, and optionally 5' → 3' and/or 3' → 5' exonuclease activity. The enzyme can be native
or recombinant, and may be used with primers and nucleoside triphosphates in a temperature-cycling chain reaction where at
least one nucleic acid sequence is amplified in quantity from an existing sequence.









PURIFIED THERMOSTABLE NUCLEIC ACID POLYMERASE
ENZYME FROM THERMOSIPHO AFRICANUS

Field of the Invention 

The present invention relates to a purified, thermostable DNA polymerase
purified from the thermophilic bacterium Thermosipho africanus (Taf) and means for
isolating and producing the enzyme. Thermostable DNA polymerases are useful in
many recombinant DNA techniques, especially nucleic acid amplification by the
polymerase chain reaction (PCR).

Background Art

Extensive research has been conducted on the isolation of DNA polymerases
from mesophilic microorganisms such as E. coli. See, for example, Bessman et al.,
1957, J. Biol. Chem. 223:171-177, and Buttin and Kornberg, 1966, J. Biol. Chem.
241:5419-5427.

Much less investigation has been made on the isolation and purification of DNA
polymerases from thermophiles such as Taf. Kaledin et al., 1980, Biokhimiya
45:644-651, disclose a six-step isolation and purification procedure of DNA polymerase from
cells of Thermus aquaticus YT-1 strain. These steps involve isolation of crude extract,
DEAE-cellulose chromatography, fractionation on hydroxyapatite, fractionation on
DEAE-cellulose, and chromatography on single-strand DNA-cellulose. The molecular
weight of the purified enzyme is reported as 62,000 daltons per monomeric unit.

A second purification scheme for a polymerase from Thermus aquaticus is
described by Chien et al., 1976, J. Bacteriol. 127:1550-1557. In this process, the
crude extract is applied to a DEAE-Sephadex column. The dialyzed pooled fractions
are then subjected to treatment on a phosphocellulose column. The pooled fractions are
dialyzed and bovine serum albumin (BSA) is added to prevent loss of polymerase
activity. The resulting mixture is loaded on a DNA-cellulose column. The pooled
material from the column is dialyzed and analyzed by gel filtration to have a molecular
weight of about 63,000 daltons and by sucrose gradient centrifugation of about 68,000
daltons.

The use of thermostable enzymes, such as those described in U.S. Patent No.
4,889,818, to amplify existing nucleic acid sequences in amounts that are large
compared to the amount initially present was described in United States Patent Nos.
4,683,195 and 4,683,202, which describe the PCR process, both disclosures of which
are incorporated herein by reference. Primers, template, nucleoside triphosphates, the
appropriate buffer and reaction conditions, and polymerase are used in the PCR
process, which involves denaturation of target DNA, hybridization of primers, and
synthesis of complementary strands. The extension product of each primer becomes a
template for the production of the desired nucleic acid sequence. The two patents
disclose that, if the polymerase employed is a thermostable enzyme, then polymerase
need not be added after every denaturation step, because heat will not destroy the
polymerase activity.

United States Patent No. 4,889,818, European Patent Publication No.
258,017, and PCT Publication No. 89/06691, the disclosures of which are
incorporated herein by reference, all describe the isolation and recombinant expression
of a ~94 kDa thermostable DNA polymerase from Thermus aquaticus and the use of
that polymerase in PCR. Although T. aquaticus DNA polymerase is especially
preferred for use in PCR and other recombinant DNA techniques, there remains a need
for other thermostable polymerases.

Accordingly, there is a desire in the art to produce a purified, thermostable DNA
polymerase that may be used to improve the PCR process described above and to
improve the results obtained when using a thermostable DNA polymerase in other
recombinant techniques such as DNA sequencing, nick-translation, and even reverse
transcription. The present invention helps meet that need by providing recombinant
expression vectors and purification protocols for a DNA polymerase from Taf.

Summary of the Invention

Accordingly, the present invention provides a purified thermostable enzyme that
catalyzes combination of nucleoside triphosphates to form a nucleic acid strand
complementary to a nucleic acid template strand. The purified enzyme is the DNA
polymerase I activity from Taf. In a preferred embodiment, the enzyme is isolated from
Taf strain OB-7 (DSM 5309). This purified material may be used in a temperature-
cycling amplification reaction wherein nucleic acid sequences are produced from a
given nucleic acid sequence in amounts that are large compared to the amount initially
present so that the sequences can be manipulated and/or analyzed easily.

The gene encoding the Taf DNA polymerase I enzyme has also been
identified and cloned and provides yet another means to prepare the thermostable
enzyme of the present invention. In addition to the portions of the gene encoding the
Taf enzyme, derivatives of these gene portions encoding Taf DNA polymerase I activity
are also provided.

The invention also encompasses a stable enzyme composition comprising a
purified, thermostable Taf enzyme as described above in a buffer containing one or
more non-ionic polymeric detergents.

Finally, the invention provides a method of purification for the thermostable
polymerase of the invention. This method involves preparing a crude extract from Taf
or recombinant host cells, adjusting the ionic strength of the crude extract so that the
DNA polymerase dissociates from nucleic acid in the extract, and subjecting the extract to at
least one chromatographic step selected from hydrophobic interaction chromatography,
DNA binding protein affinity chromatography, nucleotide binding protein affinity
chromatography, and cation, anion, or hydroxyapatite chromatography. In a preferred
embodiment, these steps are performed sequentially in the order given above. The
nucleotide binding protein affinity chromatography step is preferred for separating the
DNA polymerase from endonuclease proteins.

Brief Description of the Figures

Figure 1 shows various PCR profiles.
Figure 2 shows the effect of various PCR profiles on amplification.
Figure 3 shows various PCR profiles.

Detailed Description of the Invention

The present invention provides DNA sequences and expression vectors that
encode Taf DNA polymerase I. To facilitate understanding of the invention, a number
of terms are defined below.

The terms "cell", "cell line", and "cell culture" can be used interchangeably and
all such designations include progeny. Thus, the words "transformants" or
"transformed cells" include the primary transformed cell and cultures derived from that
cell without regard to the number of transfers. All progeny may not be precisely
identical in DNA content, due to deliberate or inadvertent mutations. Mutant progeny
that have the same functionality as screened for in the originally transformed cell are
included in the definition of transformants.

The term "control sequences" refers to DNA sequences necessary for the
expression of an operably linked coding sequence in a particular host organism. The
control sequences that are suitable for procaryotes, for example, include a promoter,
optionally an operator sequence, a ribosome binding site, and possibly other
sequences. Eucaryotic cells are known to utilize promoters, polyadenylation signals,
and enhancers.

The term "expression system" refers to DNA sequences containing a desired
coding sequence and control sequences in operable linkage, so that hosts transformed
with these sequences are capable of producing the encoded proteins. To effect
transformation, the expression system may be included on a vector; however, the
relevant DNA may also be integrated into the host chromosome.

The term "gene" refers to a DNA sequence that comprises control and coding
sequences necessary for the production of a recoverable bioactive polypeptide or
precursor. The polypeptide can be encoded by a full length gene sequence or by any
portion of the coding sequence so long as the enzymatic activity is retained.

The term "operably linked" refers to the positioning of the coding sequence
such that control sequences will function to drive expression of the protein encoded by
the coding sequence. Thus, a coding sequence "operably linked" to control sequences
refers to a configuration wherein the coding sequences can be expressed under the
direction of a control sequence.

The term "mixture" as it relates to mixtures containing Taf polymerase refers to
a collection of materials which includes Taf polymerase but which can also include
other proteins. If the Taf polymerase is derived from recombinant host cells, the other
proteins will ordinarily be those associated with the host. Where the host is bacterial,
the contaminating proteins will, of course, be bacterial proteins.

The term "non-ionic polymeric detergents" refers to surface-active agents that
have no ionic charge and that are characterized, for purposes of this invention, by an
ability to stabilize the Taf enzyme at a pH range of from about 3.5 to about 9.5,
preferably from 4 to 8.5.

The term "oligonucleotide" as used herein is defined as a molecule comprised of
two or more deoxyribonucleotides or ribonucleotides, preferably more than three, and
usually more than ten. The exact size will depend on many factors, which in turn
depend on the ultimate function or use of the oligonucleotide. The oligonucleotide
may be derived synthetically or by cloning.

The term "primer" as used herein refers to an oligonucleotide which is capable
of acting as a point of initiation of synthesis when placed under conditions in which
primer extension is initiated. An oligonucleotide "primer" may occur naturally, as in a
purified restriction digest, or be produced synthetically. Synthesis of a primer extension
product which is complementary to a nucleic acid strand is initiated in the presence of
four different nucleoside triphosphates and the Taf thermostable enzyme in an
appropriate buffer at a suitable temperature. A "buffer" includes cofactors (such as
divalent metal ions) and salt (to provide the appropriate ionic strength), adjusted to the
desired pH. For Taf polymerase, the buffer preferably contains 1 to 3 mM of a
magnesium salt, preferably MgCl2, 50 to 200 µM of each nucleotide, and 0.2 to 1 µM
of each primer, along with 50 mM KCl, 10 mM Tris buffer (pH 8.0-8.4), and 100
µg/ml gelatin (although gelatin is not required, and should be avoided in some
applications, such as DNA sequencing).

A primer is single-stranded for maximum efficiency in amplification, but may
alternatively be double-stranded. If double-stranded, the primer is first treated to
separate its strands before being used to prepare extension products. The primer is
usually an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the
synthesis of extension products in the presence of the polymerase enzyme. The exact
length of a primer will depend on many factors, such as source of primer and result
desired, and the reaction temperature must be adjusted depending on primer length and
nucleotide sequence to ensure proper annealing of primer to template. Depending on
the complexity of the target sequence, an oligonucleotide primer typically contains 15 to
35 nucleotides. Short primer molecules generally require lower temperatures to form
sufficiently stable complexes with template.
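
The link between primer length, base composition, and annealing temperature can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides (roughly 14 to 20 bases) that is not part of this disclosure: Tm ≈ 2°C × (A+T) + 4°C × (G+C). The sketch below is an illustration under that assumption only. The function name is hypothetical, and the two example primers are simply the first 15 and 20 bases of the Taf coding sequence, chosen for illustration; actual annealing temperatures are normally determined empirically.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C by the Wallace rule: 2*(A+T) + 4*(G+C).

    This is a rule-of-thumb estimate for short oligonucleotides, not a value
    taken from the patent text.
    """
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G, and T")
    return 2 * at + 4 * gc


# Illustrative primers: the first 15 and 20 bases of the Taf coding region.
short_primer = "ATGGGAAAGATGTTT"
longer_primer = "ATGGGAAAGATGTTTCTATT"

print(wallace_tm(short_primer))   # 40
print(wallace_tm(longer_primer))  # 52
```

The shorter primer gives the lower estimate, consistent with the statement that short primers generally require lower temperatures to form stable complexes with template.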

A primer is selected to be "substantially" complementary to a strand of specific
sequence of the template. A primer must be sufficiently complementary to hybridize
with a template strand for primer elongation to occur. A primer sequence need not
reflect the exact sequence of the template. For example, a non-complementary
nucleotide fragment may be attached to the 5' end of the primer, with the remainder of
the primer sequence being substantially complementary to the strand. Non-
complementary bases or longer sequences can be interspersed into the primer, provided
that the primer sequence has sufficient complementarity with the sequence of the
template to hybridize and thereby form a template-primer complex for synthesis of the
extension product of the primer.

The terms "restriction endonucleases" and "restriction enzymes" refer to
bacterial enzymes which cut double-stranded DNA at or near a specific nucleotide
sequence.

The terms "thermostable polymerase" and "thermostable enzyme" refer to an
enzyme which is stable to heat and is heat resistant and catalyzes (facilitates)
combination of the nucleotides in the proper manner to form primer extension products
that are complementary to a template nucleic acid strand. Generally, synthesis of a
primer extension product begins at the 3' end of the primer and proceeds in the 5'
direction along the template strand, until synthesis terminates.

The Taf thermostable enzyme of the present invention satisfies the requirements
for effective use in the amplification reaction known as the polymerase chain reaction or
PCR. The Taf enzyme does not become irreversibly denatured (inactivated) when
subjected to the elevated temperatures for the time necessary to effect denaturation of
double-stranded nucleic acids, a key step in the PCR process. Irreversible denaturation
for purposes herein refers to permanent and complete loss of enzymatic activity. The
heating conditions necessary for nucleic acid denaturation will depend, e.g., on the
buffer salt concentration and the composition and length of the nucleic acids being
denatured, but typically range from about 90°C to about 105°C for a time depending
mainly on the temperature and the nucleic acid length, typically from a few seconds up
to four minutes. Higher temperatures may be required as the buffer salt concentration
and/or GC composition of the nucleic acid is increased. The Taf enzyme does not
become irreversibly denatured for relatively short exposures to temperatures of about
90°C-100°C.

The Taf thermostable enzyme has an optimum temperature at which it functions
that is higher than about 45°C. Temperatures below 45°C facilitate hybridization of
primer to template, but depending on salt composition and concentration and primer
composition and length, hybridization of primer to template can occur at higher
temperatures (e.g., 45-70°C), which may promote specificity of the primer
hybridization reaction. The Taf enzyme exhibits activity over a broad temperature range
from about 37°C to 90°C.

The present invention provides DNA sequences encoding the thermostable
DNA polymerase I activity of Taf. The encoded amino acid sequence has homology to
portions of the thermostable DNA polymerases of Thermus species Z05 (TZ05),
Thermotoga maritima (Tma), Thermus aquaticus (Taq) strain YT1, T. thermophilus
(Tth), and Thermus species sps17 (Tsps17). The entire Taf coding sequence and the
deduced amino acid sequence are depicted below as SEQ ID NO: 1. The amino acid
sequence is also listed as SEQ ID NO: 2. For convenience, the amino acid sequence
of this Taf polymerase is numbered for reference. Portions of the 5' and 3' noncoding
regions of the Taf DNA polymerase I gene are also shown.

1 GAATTCTTGAAGAAGGGACTTTAAATACTAAGAGGTTTTTTAACT 
46 TAGATGGAAATGTTTACAAAAAGGGTGCATTAGATGAGAAAACAA 
91 AGGAATTAATGGGACTTGTTGCTTCAATGGTTTTAAGGTGTGATG
136 ATTGTATTACTTATCATATGATAAGGTGTGCACAACTTGGAGTTA
181 GTGATGAAGAATTTTTTGAAACTTTTGATGTGGCATTGATAGTTG

226 GAGGTTCAATAGTAATTCCTCATTTAAGACGTGCTGTTAAATTGC 



271 TTGAGGATATCAGGGAGATGCAAAAAAATGGGAAAGATGTTTCTA
1 MetGlyLysMetPheLeu

316 TTTGATGGAACTGGATTAGTATACAGAGCATTTTATGCTATAGAT
7 PheAspGlyThrGlyLeuValTyrArgAlaPheTyrAlaIleAsp



361 CAATCTCTTCAAACTTCGTCTGGTTTACACACTAATGCTGTATAC 
22 GlnSerLeuGlnThrSerSerGlyLeuHisThrAsnAlaValTyr 

406 GGACTTACTAAAATGCTTATAAAATTTTTAAAAGAACATATCAGT 
37 GlyLeuThrLysMetLeuIleLysPheLeuLysGluHisIleSer

451 ATTGGAAAAGATGCTTGTGTTTTTGTTTTAGATTCAAAAGGTGGT 
52 IleGlyLysAspAlaCysValPheValLeuAspSerLysGlyGly 

496 AGCAAT^AAAAGAAAGGATATTCTTGAAACATATAAAGCAAATAGG
67 SerLysLysArgLysAspIleLeuGluThrTyrLysAlaAsnArg 

541 CCATCAACGCCTGATTTACTTTTAGAGCAAATTCCATATGTAGAA 
82 ProSerThrProAspLeuLeuLeuGluGlnIleProTyrValGlu

586 GAACTTGTTGATGCTCTTGGAATAAAAGTTTTAAAAATAGAAGGC
97 GluLeuValAspAlaLeuGlyIleLysValLeuLysIleGluGly

631 TTTGAAGCTGATGACATTATTGCTACGCTTTCTAAAAAATTTGAA 
112 PheGluAlaAspAspIleIleAlaThrLeuSerLysLysPheGlu

676 AGTGATTTTGAAAAGGTAAACATAATAACTGGAGATAAAGATCTT
127 SerAspPheGluLysValAsnIleIleThrGlyAspLysAspLeu

721 TTACAACTTGTTTCTGATAAGGTTTTTGTTTGGAGAGTAGAAAGA 
142 LeuGlnLeuValSerAspLysValPheValTrpArgValGluArg 

766 GGAATAACAGATTTGGTATTGTACGATAGAAATAAAGTGATTGAA
157 GlyIleThrAspLeuValLeuTyrAspArgAsnLysValIleGlu

811 AAATATGGAATCTACCCAGAACAATTCAAAGATTATTTATCTCTT 
172 LysTyrGlyIleTyrProGluGlnPheLysAspTyrLeuSerLeu

856 GTCGGTGATCAGATTGATAATATCCCAGGAGTTAAAGGAATAGGA 
187 ValGlyAspGlnIleAspAsnIleProGlyValLysGlyIleGly

901 AAGAAAACAGCTGTTTCGCTTTTGAAAAAATATAATAGCTTGGAA 
202 LysLysThrAlaValSerLeuLeuLysLysTyrAsnSerLeuGlu 

946 AATGTATTAAAAAATATTAACCTTTTGACGGAAAAATTAAGAAGG 
217 AsnValLeuLysAsnIleAsnLeuLeuThrGluLysLeuArgArg




991 CTTTTGGAAGATTCAAAGGAAGATTTGCA2UAAAGTATAGAACTT 
232 LeuLeuGluAspSerLysGluAspLeuGlnLysSerIleGluLeu

1036 GTGGAGTTGATATATGATGTACCAATGGATGTGGAAAAAGATGAA 
247 ValGluLeuIleTyrAspValProMetAspValGluLysAspGlu 

1081 ATAATTTATAGAGGGTATAATCCAGATAAGCTTTTAAAGGTATTA 
262 IleIleTyrArgGlyTyrAsnProAspLysLeuLeuLysValLeu

1126 AAAAAGTACGAATTTTCATCTATAATTAAGGAGTTAAATTTACAA 
277 LysLysTyrGluPheSerSerIleIleLysGluLeuAsnLeuGln

1171 GAAAAATTAGAAAAGGAATATATACTGGTAGATAATGAAGATAAA
292 GluLysLeuGluLysGluTyrIleLeuValAspAsnGluAspLys

1216 TTGAAAAAACTTGCAGAAGAGATAGAAAAATACAA2VACTTTTTCA 
307 LeuLysLysLeuAlaGluGluIleGluLysTyrLysThrPheSer

1261 ATTGATACGGAAACAACTTCACTTGATCCATTTGAAGCTAAACTG
322 IleAspThrGluThrThrSerLeuAspProPheGluAlaLysLeu

1306 GTTGGGATCTCTATTTCCACAATGGAAGGGAAGGCGTATTATATT 
337 ValGlyIleSerIleSerThrMetGluGlyLysAlaTyrTyrIle

1351 CCGGTGTCTCATTTTGGAGCTAAGAATATTTCCAAAAGTTTAATA 
352 ProValSerHisPheGlyAlaLysAsnIleSerLysSerLeuIle

1396 GATAAATTTCTAAAACAAATTTTGCAAGAGAAGGATTATAATATC 
367 AspLysPheLeuLysGlnIleLeuGlnGluLysAspTyrAsnIle

1441 GTTGGTCAGAATTTAAAATTTGACTATGAGATTTTTAAAAGCATG
382 ValGlyGlnAsnLeuLysPheAspTyrGluIlePheLysSerMet 

1486 GGTTTTTCTCCAAATGTTCCGCATTTTGATACGATGATTGCAGCC 

397 GlyPheSerProAsnValProHisPheAspThrMetIleAlaAla

1531 TATCTTTTAAATCCAGATGAAAAACGTTTTAATCTTGAAGAGCTA
412 TyrLeuLeuAsnProAspGluLysArgPheAsnLeuGluGluLeu

1576 TCCTTAAAATATTTAGGTTATAAAATGATCTCGTTTGATGAATTA 
427 SerLeuLysTyrLeuGlyTyrLysMetIleSerPheAspGluLeu

1621 GTAAATGAAAATGTACCATTGTTTGGAAATGACTTTTCGTATGTT 
442 ValAsnGluAsnValProLeuPheGlyAsnAspPheSerTyrVal

1666 CCACTAGAAAGAGCCGTTGAGTATTGCTGTGAAGATGCCGATGTG
457 ProLeuGluArgAlaValGluTyrSerCysGluAspAlaAspVal

1711 ACATACAGAATATTTAGAAAGCTTGGTAGGAAGATATATGAAAAT 
472 ThrTyrArgIlePheArgLysLeuGlyArgLysIleTyrGluAsn

1756 GAGATGGAAAAGTTGTTTTACGAAATTGAGATGCCCTTAATTGAT 
487 GluMetGluLysLeuPheTyrGluIleGluMetProLeuIleAsp

1801 GTTGTTTCAGAAATGGAACTAAATGGAGTGTATTTTGATGAGGAA
502 ValLeuSerGluMetGluLeuAsnGlyValTyrPheAspGluGlu 

1846 TATTTAAAAGAATTATCAAAAAAATATCAAGAAAAAATGGATGGA 
517 TyrLeuLysGluLeuSerLysLysTyrGlnGluLysMetAspGly

1891 ATTAAGGAAAAAGTTTTTGAGATAGCTGGTGAAACTTTCAATTTA
532 IleLysGluLysValPheGluIleAlaGlyGluThrPheAsnLeu

1936 AACTCTTCAACTCAAGTAGCATATATACTATTTGAAAAATTAAAT
547 AsnSerSerThrGlnValAlaTyrIleLeuPheGluLysLeuAsn

1981 ATTGCTCCTTACAAAAAAACAGCGACTGGTAAGTTTTCAACTAAT 
562 IleAlaProTyrLysLysThrAlaThrGlyLysPheSerThrAsn

2026 GCGGAAGTTTTAGAAGAACTTTGAAAAGAACATGAAATTGCAAAA 
577 AlaGluValLeuGluGluLeuSerLysGluHisGluIleAlaLys 

2071 TTGTTGCTGGAGTATCGAAAGTATCAAAAATTAAAAAGTACATAT 
592 LeuLeuLeuGluTyrArgLysTyrGlnLysLeuLysSerThrTyr

2116 ATTGATTCAATACCGTTATCTATTAATCGAAAAACAAACAGGGTC 
607 IleAspSerIleProLeuSerIleAsnArgLysThrAsnArgVal

2161 CATACTACTTTTCATCAAACAGGAACTTCTACTGGAAGATTAAGT
622 HisThrThrPheHisGlnThrGlyThrSerThrGlyArgLeuSer 

2206 AGTTCAAATCCAAATTTGCAAAATCTTCCAACAAGAAGCGAAGAA 
637 SerSerAsnProAsnLeuGlnAsnLeuProThrArgSerGluGlu

2251 GGAAAAGAAATAAGAAAAGCAGTAAGACCTCAAAGACAAGATTGG 
652 GlyLysGluIleArgLysAlaValArgProGlnArgGlnAspTrp 

2296 TGGATTTTAGGTGCTGACTATTCTCAGATAGAACTAAGGGTTTTA
667 TrpIleLeuGlyAlaAspTyrSerGlnIleGluLeuArgValLeu

2341 GCGCATGTAAGTAAAGATGAAAATCTACTTAAAGCATTTAAAGAA 
682 AlaHisValSerLysAspGluAsnLeuLeuLysAlaPheLysGlu 

2386 GATTTAGATATTCATACAATTACTGCTGCCAAAATTTTTGGTGTT
697 AspLeuAspIleHisThrIleThrAlaAlaLysIlePheGlyVal

2431 TCAGAGATGTTTGTTAGTGAACAAATGAGAAGAGTTGGAAAGATG 
712 SerGluMetPheValSerGluGlnMetArgArgValGlyLysMet

2476 GTAAATTTTGCAATTATTTATGGAGTTTCACCTTATGGTCTTTCA 
727 ValAsnPheAlaIleIleTyrGlyValSerProTyrGlyLeuSer

2521 AAGAGAATTGGTCTTAGTGTTTCAGAGACTAAAAAAATAATAGAT 
742 LysArgIleGlyLeuSerValSerGluThrLysLysIleIleAsp

2566 AACTATTTTAGATACTATAAAGGAGTTTTTGAATATTTAAAAAGG 
757 AsnTyrPheArgTyrTyrLysGlyValPheGluTyrLeuLysArg

2611 ATGAAAGATGAAGCAAGGAAAAAAGGTTATGTTACAACGCTTTTT
772 MetLysAspGluAlaArgLysLysGlyTyrValThrThrLeuPhe

2656 GGAAGGCGCAGATATATTCCACAGTTAAGATCGAAAAATGGTAAT 
787 GlyArgArgArgTyrIleProGlnLeuArgSerLysAsnGlyAsn

2701 AGAGTTCAAGAAGGAGAAAGAATAGCTGTAAACACTCCAATTCAA
802 ArgValGlnGluGlyGluArgIleAlaValAsnThrProIleGln

2746 GGAACAGCAGCTGATATAATAAAGATAGCTATGATTAATATTCAT
817 GlyThrAlaAlaAspIleIleLysIleAlaMetIleAsnIleHis

2791 AATAGATTGAAGAAGGAAAATCTACGTTCAAAAATGATATTGCAG
832 AsnArgLeuLysLysGluAsnLeuArgSerLysMetIleLeuGln

2836 GTTCATGACGAGTTAGTTTTTGAAGTGCCCGATAATGAACTGGAG
847 ValHisAspGluLeuValPheGluValProAspAsnGluLeuGlu

2881 ATTGTAAAAGATTTAGTAAGAGATGAGATGGAAAATGCAGTTAAG
862 IleValLysAspLeuValArgAspGluMetGluAsnAlaValLys

2926 CTAGACGTTCCTTTAAAAGTAGATGTTTATTATGGAAAAGAGTGG 
877 LeuAspValProLeuLysValAspValTyrTyrGlyLysGluTrp

2971 GAATAATGGCTGGGGTAAAGGAATTTAAAGATCTAATAGAATTAA
892 Glu 

3016 ATGAATATGTTACAAAAAAAATAGAATTGACGGGTCTTACAAGTG
3061 AGACCTTTAGGTTTTATGCAGATGTTGTTAGAGCCAATAACCATT
3106 CTACAGGTTTGTATATTGATGTTTCACAACCTTATACTGCAAAGA
3151 ATGGAACAAGAAATATTGAAATTACTGTGTATGTAGCTAGATATG 
3196 TTGCACCAAAAATTTTGGAAGTTATAAAAGTTTCTAATGTTAAGG 
3241 AACTTGTTGGGAAAAAATGGATTTTTCAAGGGAGACTTTCTTTTT 
3286 TCAGAGATAGAATGAGTTTTACCTTCTATGCAGATACAATAGCTC 

3331 CGATGGGAGAATCTGAGATTGAAAAAAGAAGAAAAGAAATATTGA
3376 AAGAGCTTGAGGTTAGAAATTTATTAATGAAAGAAAAGCATGATC
3421 TTTCTGAATTGCCACCAATAAAAAAGATTGCTATTATAACATCTA 
3466 AAAGTGCAGCGGGTTACGAAGATTTTTTAAAAAACTTGACAGTTC
3511 ATTATTTGTACCGCCCTATTGTTCACCTTTATGAATCACCTATGC
3556 AAGGGGCACAGACTGCATCTGGTATTATTTTAGCGCTTAATCGTA
3601 TAAGAAAATCGAATATAGACTATGATGTTGTTGTTATTGCTCGTG 
3646 GCGGTGGTGCAAGAAGCGATCTGATGTATTTTGATGATTTGTCAC 
3691 TTGGAATAGAAATTGCAAAGTTTAATGAGTATTGTCCAATTTTAT 
3736 CGGGCATAGGTCATGAAAGAGATTTTACAATTCCAGATTATGTTG
3781 CCTGGAAGAGATTTGCTACTCCGACAGAAGTTGCAAGAGCTATAT
3826 CAAAGCAAATAGAAGATAATGTGAAAAAATTGGATGATAGTTATA
3871 ATGACTTAAGGATTTTACTTTCTAATGTTTTTAAAATCTATGAGA
3916 GAACGGTAGAATTGGGTCTGATAGA!TTATATGAAGT^AAGTTATAG
3961 GAAGCGATTTTATAAAGATAGTAAAAGATCTGGATGAAACTTATG
4006 AAAAGATTGAAAATTTTGTTAGTTACAAAATAAATGATTCATCTC
4051 AGAGACTATCTGAAGATTTTTTAAGGTTTATGTCTAATTCTCTTG
4096 AAAATAAGTTGAAATCCAAAAAGGACAGTGTTGAZ^AATTTTGAAA

4141 AAATACTTGAAAAAGATATATCAATTTTACTTTCAAATAAAGAGA 
4186 CAATGCTTAATGAAACATTTCAGGAGCTTTTAAAACGAGAAGAAT 
4231 TTGCACCACTTTTATTTGGTGGGGCATTGGTTATGAAAAGTGGAC 
4276 ATT^TGTAAAA
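
The pairing of nucleotide lines with numbered amino acid lines in the listing above can be checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the function and variable names are hypothetical and are not part of the disclosure. It translates a coding-strand segment into three-letter amino acid codes and stops at the first stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA segment, reading frame 1, up to a stop codon."""
    dna = dna.upper()
    residues = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":          # stop codon ends the open reading frame
            break
        residues.append(THREE_LETTER[aa])
    return "".join(residues)


# First six codons of the coding region (nucleotide line 271, from the ATG).
print(translate("ATGGGAAAGATGTTTCTA"))  # MetGlyLysMetPheLeu
# Last codon and stop codon (start of nucleotide line 2971).
print(translate("GAATAATGG"))           # Glu
```

Applied line by line, a check of this kind flags places where the printed codons and the printed residues disagree, which is useful when a listing has passed through scanning or retyping.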

The above nucleotide sequence was identified by a "degenerate primer" method
that has broad utility and is an important aspect of the present invention. In the
degenerate primer method, DNA fragments of any thermostable polymerase coding
sequence corresponding to conserved domains of known thermostable DNA
polymerases can be identified.

The degenerate primer method was developed by comparing the amino acid
sequences of DNA polymerase I proteins from Taq, Tth, T7, and E. coli in which
various conserved regions were identified. Primers corresponding to these conserved
regions were then designed. As a result of the present invention, Taf sequences can be
used to design other degenerate primers, as can the coding sequences of the Thermus
species sps17 DNA polymerase I gene (see PCT Publication No. ______, filed
September 30, 1991, and incorporated herein by reference) and the Thermotoga
maritima DNA polymerase I gene (see PCT Publication No. ______, filed
August 13, 1991, and incorporated herein by reference), and the Thermus species Z05
DNA polymerase I gene (see PCT Publication No. ______, filed
September 30, 1991, and incorporated herein by reference). The generic utility of the
degenerate primer process is exemplified herein by specific reference to the method as
applied to cloning the Taf DNA polymerase I gene.

To clone the Taf DNA polymerase I gene, regions of conserved amino acid
sequences of DNA polymerase I enzymes were converted to all of the possible codons
which represent each of the amino acids. Due to the degenerate nature of the genetic
code, a given amino acid may be represented by several different codons. Where more
than one base can be present in a codon for a given amino acid, the sequence is said to
be degenerate.

The primers were then synthesized as a pool of all of the possible DNA
sequences that could code for a given amino acid sequence. The amount of degeneracy
of a given primer pool can be determined by multiplying the number of possible
nucleotides at each position.
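
The degeneracy calculation described above can be sketched as follows, assuming the standard genetic code. The function names are hypothetical, and the six-residue peptide used here (Asp-Pro-Asn-Leu-Gln-Asn) is only an illustrative stretch, not one of the conserved regions actually used for the primers.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each one-letter amino acid code to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def position_sets(peptide: str) -> list:
    """For each nucleotide position, the set of bases seen in any codon for that residue."""
    sets = []
    for aa in peptide.upper():
        for i in range(3):
            sets.append({codon[i] for codon in CODONS[aa]})
    return sets


def pool_degeneracy(peptide: str) -> int:
    """Multiply the number of possible nucleotides at each position."""
    return prod(len(s) for s in position_sets(peptide))


def distinct_coding_sequences(peptide: str) -> int:
    """Number of DNA sequences that encode the peptide exactly."""
    return prod(len(CODONS[aa]) for aa in peptide.upper())


peptide = "DPNLQN"  # Asp-Pro-Asn-Leu-Gln-Asn, illustrative only
print(pool_degeneracy(peptide))            # 512
print(distinct_coding_sequences(peptide))  # 384
```

The two numbers differ because leucine has codons in two families (TTA/TTG and CTN): mixing bases independently at each position also produces TTT and TTC, which encode phenylalanine. The per-position product therefore counts every sequence present in a mixed-base synthesis, which is the quantity the text describes.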

The greater the number of individual unique primer DNA sequences within a
primer pool, the greater the probability that one of the unique primer sequences will
bind to regions of the target chromosomal DNA other than the one desired; hence, the
lesser the specificity of the resulting amplification. To increase the specificity of the
amplification using degenerate primers, the pools are synthesized as subsets such that
the entire group of subsets includes all possible DNA sequences encoding the given
amino acid sequence, but each individual subset only includes a portion: for example,
one pool may contain either a G or C at a certain position while another pool contains
either an A or T at the same position. As described herein, these subpools are
designated with a DG number (where the number is between 99 and 200).
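
The subset strategy can be sketched with IUPAC ambiguity codes (S = G or C, W = A or T, N = any base). This is an illustration only: the helper names and the example pool CCNAAY (one way of writing Pro-Asn) are hypothetical and are not the actual DG primer pools.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(primer: str) -> set:
    """All unique sequences represented by a degenerate primer."""
    return {"".join(p) for p in product(*(IUPAC[b] for b in primer.upper()))}


def split_at(primer: str, index: int) -> tuple:
    """Split a pool at a fully degenerate (N) position into G/C and A/T subpools."""
    if primer[index].upper() != "N":
        raise ValueError("position is not fully degenerate (N)")
    gc_pool = primer[:index] + "S" + primer[index + 1:]
    at_pool = primer[:index] + "W" + primer[index + 1:]
    return gc_pool, at_pool


full_pool = "CCNAAY"                      # hypothetical pool, 4 x 2 = 8 sequences
gc_pool, at_pool = split_at(full_pool, 2)

print(gc_pool, len(expand(gc_pool)))      # CCSAAY 4
print(at_pool, len(expand(at_pool)))      # CCWAAY 4
# Together the subpools cover the full pool, and they share no sequence.
print(expand(gc_pool) | expand(at_pool) == expand(full_pool))  # True
print(expand(gc_pool) & expand(at_pool) == set())              # True
```

Each subpool halves the number of unique sequences competing in a reaction while the set of subpools as a whole still represents every sequence in the original pool.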

Both forward primers (directed from the 5' region toward the 3' region of the
gene, complementary to the noncoding strand) and reverse primers (directed from the
3' region of the gene toward the 5' region of the gene, complementary to the coding
strand) were designed for most of the conserved regions to clone Taf polymerase. The
primers were designed with restriction sites at the 5' ends of the primers to facilitate
cloning. The forward primers contained a BglII restriction site (AGATCT), while the
reverse primers contained an EcoRI restriction site (GAATTC). In addition, the
primers contained 2 additional nucleotides at the 5' end to increase the efficiency of
cutting at the restriction site.
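
The primer layout described above (two extra 5' nucleotides, then the restriction site, then the gene-specific portion) can be sketched as follows. Several details are assumptions made for illustration: the text does not state which two extra nucleotides were used, so "CG" is a placeholder; the gene-specific portions are non-degenerate segments copied from the two ends of the Taf coding sequence rather than the actual degenerate pools; and the function names are hypothetical.

```python
BGLII_SITE = "AGATCT"   # forward primers
ECORI_SITE = "GAATTC"   # reverse primers

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def add_cloning_tail(core: str, site: str, extra: str = "CG") -> str:
    """Prepend two extra nucleotides (placeholder 'CG') and a restriction site."""
    if len(extra) != 2:
        raise ValueError("the text calls for 2 additional nucleotides")
    return extra + site + core.upper()


# Forward primer: same sense as the coding strand (start of the coding region).
forward = add_cloning_tail("ATGGGAAAGATGTTTCTA", BGLII_SITE)

# Reverse primer: complementary to the coding strand (end of the coding region).
coding_segment = "TATTATGGAAAAGAGTGGGAA"
reverse = add_cloning_tail(reverse_complement(coding_segment), ECORI_SITE)

print(forward)  # CGAGATCTATGGGAAAGATGTTTCTA
print(reverse)  # CGGAATTCTTCCCACTCTTTTCCATAATA
```

Placing the site at the 5' end leaves the 3' end of each primer free to anneal and be extended, and the extra nucleotides keep the site from sitting at the very end of the amplified fragment, where many restriction enzymes cut poorly.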

Degenerate primers were then used in PCR processes to amplify chromosomal
DNA from Taf. The products of the PCR processes using a combination of forward
and reverse primer pools in conjunction with a series of temperature profiles were
compared. When specific products of similar size to the product generated using Taq
chromosomal DNA were produced, the PCR fragments were gel purified, reamplified,
and cloned into the vector pBSM13+HindIII::BglII (a derivative of the Stratagene™
vector pBSM13+, now marketed as pBS+, in which the HindIII site of pBSM13+ was
converted to a BglII site). The PCR fragments were cloned and sequenced; fragments
were identified as potential thermostable DNA polymerase coding sequences if the
fragments contained sequences that encode regions of amino acid homology to other
known polymerase protein sequences, particularly those of Taq polymerase and Tth
polymerase.

The portions of the Taf DNA polymerase gene were then identified in the
chromosomal DNA of Taf by Southern blot analysis. The Taf chromosomal DNA was
digested with a variety of enzymes and transferred to nitrocellulose filters. Probes
labeled with 32P or biotin-dUTP were generated for various regions of the gene from
the cloned PCR products. The probes were hybridized to the nitrocellulose-bound
genomic DNA, allowing identification of the molecular weight of the chromosomal
DNA fragment hybridizing to the probe. The use of probes covering the 5' and 3'
regions of the gene ensures that the DNA fragment(s) contain most if not all of the
structural gene for the polymerase. Restriction enzymes can be identified that can be
used to produce fragments that contain the structural gene in a single DNA fragment or
in several DNA fragments to facilitate cloning.
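
The logic of matching probes to restriction fragments can be mimicked on a known
sequence. The sketch below is only an in silico analogy of the Southern blot
reasoning, not the laboratory procedure: it cuts a toy linear sequence at every
occurrence of a recognition site and reports which fragments overlap a probe region.
The toy sequence, the probe coordinates and the function names are all assumptions
made for illustration.

```python
def digest(sequence, site, cut_offset):
    """Cut a linear sequence at every occurrence of `site`.

    `cut_offset` is the position of the cut within the site on the top strand.
    Returns (start, end) coordinates of each resulting fragment.
    """
    cuts = []
    i = sequence.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = sequence.find(site, i + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [(a, b) for a, b in zip(bounds, bounds[1:]) if b > a]

def fragments_hit_by(fragments, probe_start, probe_end):
    """Fragments overlapping the probe region (those a probe would detect)."""
    return [(a, b) for a, b in fragments if a < probe_end and b > probe_start]

# Toy 60-nt sequence with two EcoRI sites (G^AATTC, cut after the G).
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 18
frags = digest(toy, "GAATTC", 1)
print(frags)
print(fragments_hit_by(frags, 20, 30))
```

An enzyme whose digest places both the 5' and the 3' probe regions on one fragment is
a candidate for recovering the structural gene in a single piece.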

Once identified, chromosomal DNA encoding portions of the Taf DNA
polymerase gene was cloned. Chromosomal DNA was digested with the identified
restriction enzymes, and size fractionated. Fractions containing the desired size range
were concentrated, desalted, and cloned into the pBSM13+HindIII::BglII cloning
vector. Clones were identified by hybridization using labeled probes generated from
the previous cloned PCR products. The cloned fragments were identified by restriction
enzyme analysis and Southern blot analysis.

The DNA sequence and amino acid sequence shown above and the DNA
compounds that encode those sequences can be used to design and construct
recombinant DNA expression vectors to express Taf DNA polymerase activity in a
wide variety of host cells. A DNA compound encoding all or part of the DNA
sequence shown above can also be used as a probe to identify thermostable
polymerase-encoding DNA from other organisms, and the amino acid sequence shown
above can be used to design peptides for use as immunogens to prepare antibodies that
can be used to identify and purify a thermostable polymerase.

Whether produced by recombinant vectors that encode the above amino acid
sequence or by native Taf cells, however, Taf DNA polymerase will typically be
purified prior to use in a recombinant DNA technique. The present invention provides
such purification methodology.

For recovering the native protein, the cells are grown using the method of
Huber et al., 1989, System. Appl. Microbiol. 12:32-37. After cell growth, the isolation
and purification of the enzyme takes place in six stages, each of which is carried out at a
temperature below room temperature, preferably about 0° to about 4°C, unless stated
otherwise.

In the first stage or step, the cells, if frozen, are thawed, disintegrated by
ultrasound, suspended in a buffer at about pH 7.5, and centrifuged.

In the second stage, the supernatant is collected and then fractionated by adding
a salt such as dry ammonium sulfate. The appropriate fraction (typically 45-75% of
saturation) is collected, dissolved in a 0.2 M potassium phosphate buffer preferably at
pH 6.5, and dialyzed against the same buffer.

The third step removes nucleic acids and some protein. The fraction from the
second stage is applied to a DEAE-cellulose column equilibrated with the same buffer
as used above. Then the column is washed with the same buffer and the flow-through
protein-containing fractions, determined by absorbance at 280 nm, are collected and
dialyzed against a 10 mM potassium phosphate buffer, preferably with the same
ingredients as the first buffer, but at a pH of 7.5.

The fourth step consists of hydroxyapatite chromatography; the fraction so
collected is applied to a hydroxyapatite column equilibrated with the buffer used for
dialysis in the third step. The column is then washed and the enzyme eluted with a
linear gradient of a buffer such as 0.01 M to 0.5 M potassium phosphate buffer at pH
7.5 containing 10 mM 2-mercaptoethanol and 5% glycerol. The pooled fractions
containing thermostable DNA polymerase activity are dialyzed against the same buffer
used for dialysis in the third step.
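
For orientation, the salt concentration delivered at any point of such a linear
gradient follows from simple interpolation between the two end concentrations. The
sketch below assumes an ideal linear gradient; the 200 ml gradient volume is an
assumed value, as the text here does not specify one, and the function name is
illustrative.

```python
def gradient_concentration(volume_delivered, total_volume, start_molar, end_molar):
    """Salt concentration entering the column at a given point of a linear gradient."""
    if not 0 <= volume_delivered <= total_volume:
        raise ValueError("volume_delivered must lie within the gradient")
    fraction = volume_delivered / total_volume
    return start_molar + fraction * (end_molar - start_molar)

# 0.01 M to 0.5 M potassium phosphate; the 200 ml gradient volume is assumed.
midpoint = gradient_concentration(100.0, 200.0, 0.01, 0.5)
print(f"{midpoint:.3f} M at the midpoint of the gradient")
```

The same interpolation applies to the KCl gradients used in the later ion exchange
and phosphocellulose steps, with the corresponding end concentrations.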

The fifth stage consists of anion exchange chromatography; the dialyzed
fraction is applied to a DEAE-cellulose column, equilibrated with the buffer used for
dialysis in the third step. The column is then washed and the enzyme eluted with a
linear gradient of a buffer such as 0.01 to 0.6 M KCl in the buffer used for dialysis in
the third step. Fractions with thermostable enzyme activity are then tested for
contaminating deoxyribonucleases (endo- and exonucleases) using any suitable
procedure. For example, the endonuclease activity may be determined
electrophoretically from the change in molecular weight of phage lambda DNA or
supercoiled plasmid DNA after incubation with an excess of DNA polymerase.
Similarly, exonuclease activity may be determined electrophoretically from the change
in molecular weight of restriction enzyme-cleaved DNA after treatment with the DNA
polymerase fraction. The fractions determined to have polymerase activity but no
deoxyribonuclease activity are pooled and dialyzed against the same buffer used in the
third step.

The sixth step consists of DNA binding protein affinity chromatography; the
pooled fractions are placed on a phosphocellulose column with a set bed volume. The
column is washed and the enzyme eluted with a linear gradient of a buffer such as 0.01
to 0.8 M KCl in a potassium phosphate buffer at pH 7.5. The pooled fractions having
thermostable polymerase activity and no deoxyribonuclease activity are dialyzed against
a buffer at pH 8.0.

The molecular weight of the DNA polymerase purified from Taf may be
determined by any technique, for example, by SDS-PAGE analysis using protein
molecular weight markers. The molecular weight, calculated from the coding
sequence, of Taf DNA polymerase I is 103,273 daltons. The purification protocol of
native Taf DNA polymerase is described in detail in Example 1. Purification of the
recombinant Taf polymerase of the invention can be carried out with similar
methodology.
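
A molecular weight "calculated from the coding sequence" is ordinarily obtained by
summing average residue masses over the deduced amino acid sequence and adding one
water. The sketch below shows that arithmetic; the residue masses are standard average
values, the short peptide is a made-up example rather than Taf sequence, and the
function name is illustrative. It is not claimed to be the calculation used to arrive
at the figure given above.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water for the free N- and C-termini

def molecular_weight(protein):
    """Average molecular weight of an unmodified polypeptide, in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in protein) + WATER

# Hypothetical four-residue peptide, for illustration only.
print(f"{molecular_weight('MKGA'):.1f} daltons")
```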

The entire coding sequence of the Taf DNA polymerase gene is not required,
however, to produce a biologically active gene product with DNA polymerase activity.
The availability of DNA encoding the Taf DNA polymerase sequence provides the
opportunity to modify the coding sequence so as to generate mutein (mutant protein)
forms also having DNA polymerase activity. The amino(N)-terminal portion of the Taf
polymerase is not believed to be necessary for polymerase activity. Using recombinant
DNA methodology, one can delete up to approximately one-third of the N-terminal
coding sequence of the Taf gene, clone, and express a gene product that is quite active
in polymerase assays. Because certain N-terminal shortened forms of the polymerase
are active, the gene constructs used for expression of these polymerases can include the
corresponding shortened forms of the coding sequence.
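
At the sequence level, an N-terminal deletion of this kind amounts to removing whole
codons from the 5' end of the coding sequence so that the reading frame is preserved.
The sketch below illustrates the bookkeeping on a toy sequence. Prepending a new ATG
start codon is an assumption of this example (one common way to make a truncated
construct translatable), and the function name and toy gene are hypothetical.

```python
def truncate_n_terminus(coding_sequence, codons_removed):
    """Delete whole codons from the 5' end, keeping the reading frame.

    Refuses to remove more than one-third of the codons, following the text.
    A new ATG start codon is prepended (an assumption of this sketch).
    """
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence must be a whole number of codons")
    n_codons = len(coding_sequence) // 3
    if codons_removed < 0 or 3 * codons_removed > n_codons:
        raise ValueError("the text describes deleting up to about one-third")
    if codons_removed == 0:
        return coding_sequence
    return "ATG" + coding_sequence[3 * codons_removed:]

# Toy 9-codon open reading frame, for illustration only.
toy_gene = "ATGAAAGGTCCTGAACTGTTCGACTAA"
shortened = truncate_n_terminus(toy_gene, 3)
print(shortened)
```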

In addition to the N-terminal deletions, individual amino acid residues in the
peptide chain of Taf polymerase may be modified by oxidation, reduction, or other
derivation, and the protein may be cleaved to obtain fragments that retain activity. Such
alterations that do not destroy activity do not remove the protein from the definition of a
protein with Taf polymerase activity and so are specifically included within the scope of
the present invention.

Modifications to the primary structure of the Taf DNA polymerase gene by
deletion, addition, or alteration so as to change the amino acids incorporated into the
Taf DNA polymerase during translation can be made without destroying the high
temperature DNA polymerase activity of the protein. Such substitutions or other
alterations result in the production of proteins having an amino acid sequence encoded
by DNA falling within the contemplated scope of the present invention. Likewise, the
cloned genomic sequence, or homologous synthetic sequences, of the Taf DNA
polymerase gene can be used to express a fusion polypeptide with Taf DNA
polymerase activity or to express a protein with an amino acid sequence identical to that
of native Taf DNA polymerase. In addition, such expression can be directed by a
control sequence that functions in whatever host is chosen to express the Taf DNA
polymerase.

Thus, the present invention provides a coding sequence for Taf DNA
polymerase from which expression vectors applicable to a variety of host systems can
be constructed and the coding sequence expressed. Portions of the Taf polymerase-
encoding sequence are also useful as probes to retrieve other thermostable polymerase-
encoding sequences in a variety of species. Accordingly, oligonucleotide probes that
encode at least four to six amino acids can be synthesized and used to retrieve additional
DNAs encoding a thermostable polymerase. Because there may not be an exact match
between the nucleotide sequence of the thermostable DNA polymerase gene of Taf and
the corresponding gene of other species, oligomers containing approximately 12-18
nucleotides (encoding the four to six amino acid sequence) are usually necessary to
obtain hybridization under conditions of sufficient stringency to eliminate false positives.
Sequences encoding six amino acids supply ample information for such probes.
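
The relationship between peptide length and probe length (three nucleotides per amino
acid, hence 12-18 nucleotides for four to six residues) and the degeneracy of the
corresponding probe pool can be worked out as follows. The codon counts are those of
the standard genetic code; the two peptides are made-up examples, and the function
names are illustrative.

```python
# Number of codons for each amino acid in the standard genetic code.
CODON_COUNT = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}

def probe_length(peptide):
    """Length in nucleotides of an oligomer encoding the peptide."""
    return 3 * len(peptide)

def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode the peptide."""
    n = 1
    for aa in peptide:
        n *= CODON_COUNT[aa]
    return n

# Hypothetical six-residue peptides: one low-degeneracy, one high-degeneracy.
for peptide in ("MWKDEF", "LRSLRS"):
    print(peptide, probe_length(peptide), "nt,", degeneracy(peptide), "sequences")
```

Peptide stretches rich in methionine, tryptophan and two-codon residues give far
smaller probe pools than stretches rich in leucine, arginine or serine.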

The present invention, by providing coding sequences and amino acid
sequences for Taf DNA polymerase, therefore enables the isolation of other
thermostable polymerase enzymes and the coding sequences for those enzymes. The
deduced amino acid sequence of the Taf DNA polymerase I protein is similar to the
amino acid sequences for other thermostable DNA polymerases, such as those from
Taq and Tth (see PCT Publication No. 91/09950, incorporated herein by reference).
However, regions of dissimilarity between the coding sequences of the
thermostable DNA polymerases can also be used as probes to identify other
thermostable polymerase coding sequences which encode enzymes having some
properties of one known thermostable polymerase and perhaps different properties.
For example, the coding sequence for a thermostable polymerase having some
properties of Taq and other divergent properties of Taf may be identified by using
probes comprising regions of dissimilarity between Taq and Taf.

Whether one desires to produce an enzyme identical to native Taf DNA
polymerase or a derivative or homologue of that enzyme, the production of a
recombinant form of Taf polymerase typically involves the construction of an
expression vector, the transformation of a host cell with the vector, and culture of the
transformed host cell under conditions such that expression will occur.

To construct the expression vector, a DNA is obtained that encodes the mature
(used here to include all muteins) enzyme or a fusion of the Taf polymerase to an
additional sequence that does not destroy activity or to an additional sequence cleavable
under controlled conditions (such as treatment with peptidase) to give an active protein.
The coding sequence is then placed in operable linkage with suitable control sequences
in an expression vector. The vector can be designed to replicate autonomously in the
host cell or to integrate into the chromosomal DNA of the host cell. The vector is used
to transform a suitable host, and the transformed host is cultured under conditions
suitable for expression of recombinant Taf polymerase. The Taf polymerase is isolated
from the medium or from the cells, although recovery and purification of the protein
may not be necessary in some instances.

Each of the foregoing steps can be done in a variety of ways. For example, the
desired coding sequence may be obtained from genomic fragments and used directly in
appropriate hosts. The construction of expression vectors operable in a variety of hosts
is made using appropriate replicons and control sequences, as set forth generally
below. Construction of suitable vectors containing the desired coding and control
sequences employs standard ligation and restriction techniques that are well understood
in the art. Isolated plasmids, DNA sequences, or synthesized oligonucleotides are
cleaved, modified, and religated in the form desired. Suitable restriction sites can, if
not normally available, be added to the ends of the coding sequence so as to facilitate
construction of an expression vector, as exemplified below.

Site-specific DNA cleavage is performed by treating with the suitable restriction
enzyme (or enzymes) under conditions that are generally understood in the art and
specified by the manufacturers of commercially available restriction enzymes. See,
e.g., New England Biolabs, Product Catalog. In general, about 1 µg of plasmid or
other DNA is cleaved by one unit of enzyme in about 20 µl of buffer solution; in the
examples below, an excess of restriction enzyme is generally used to ensure complete
digestion of the DNA. Incubation times of about one to two hours at about 37°C are
typical, although variations can be tolerated. After each incubation, protein is removed
by extraction with phenol and chloroform; this extraction can be followed by ether
extraction and recovery of the DNA from aqueous fractions by precipitation with
ethanol. If desired, size separation of the cleaved fragments may be performed by
polyacrylamide gel or agarose gel electrophoresis using standard techniques. See, e.g.,
Maxam et al., Methods in Enzymology, 1980, 65:499-560.

Restriction-cleaved fragments with single-strand "overhanging" termini can be
made blunt-ended (double-strand ends) by treating with the large fragment of E. coli
DNA polymerase I (Klenow) in the presence of the four deoxynucleoside triphosphates
(dNTPs) using incubation times of about 15 to 25 minutes at 20°C to 25°C in 50 mM
Tris, pH 7.6, 50 mM NaCl, 10 mM MgCl2, 10 mM DTT, and 5 to 10 µM dNTPs.
The Klenow fragment fills in at 5' protruding ends, but chews back protruding 3'
single strands, even though the four dNTPs are present. If desired, selective repair can
be performed by supplying only one of the, or selected, dNTPs within the limitations
dictated by the nature of the protruding ends. After treatment with Klenow, the mixture
is extracted with phenol/chloroform and ethanol precipitated. Similar results can be
achieved using S1 nuclease, because treatment under appropriate conditions with S1
nuclease results in hydrolysis of any single-stranded portion of a nucleic acid.

Synthetic oligonucleotides can be prepared using the triester method of
Matteucci et al., 1981, J. Am. Chem. Soc. 103:3185-3191, or automated synthesis
methods. Kinasing of single strands prior to annealing or for labeling is achieved using
an excess, e.g., approximately 10 units, of polynucleotide kinase to 0.5 µM substrate
in the presence of 50 mM Tris, pH 7.6, 10 mM MgCl2, 5 mM dithiothreitol (DTT), and
1 to 2 µM ATP. If kinasing is for labeling of probe, the ATP will be labeled with 32P.

Ligations are performed in 15-30 µl volumes under the following standard
conditions and temperatures: 20 mM Tris-Cl, pH 7.5, 10 mM MgCl2, 10 mM DTT, 33
µg/ml BSA, 10 mM-50 mM NaCl, and either 40 µM ATP and 0.01-0.02 (Weiss) units
T4 DNA ligase at 0°C (for ligation of fragments with complementary single-stranded
ends) or 1 mM ATP and 0.3-0.6 units T4 DNA ligase at 14°C (for "blunt end"
ligation). Intermolecular ligations of fragments with complementary ends are usually
performed at 33-100 µg/ml total DNA concentrations (5-100 nM total ends
concentration). Intermolecular blunt end ligations (usually employing a 20-30 fold
molar excess of linkers, optionally) are performed at 1 µM total ends concentration.
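
The correspondence between a mass concentration of DNA and a molar concentration of
ends depends on fragment length. The sketch below shows the conversion for linear
double-stranded DNA, assuming the commonly used approximation of about 650 g/mol per
base pair; the fragment lengths are assumed values chosen for illustration, and the
function name is hypothetical.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of double-stranded DNA (assumed)

def ends_concentration_nM(dna_ug_per_ml, fragment_bp):
    """Molar concentration of fragment ends for linear double-stranded DNA."""
    grams_per_litre = dna_ug_per_ml * 1e-3          # 1 ug/ml equals 1e-3 g/L
    molar = grams_per_litre / (fragment_bp * AVG_BP_MASS)
    return 2 * molar * 1e9                          # two ends per molecule, M to nM

# Assumed fragment lengths: a 20 kb fragment and a 3 kb fragment.
print(f"{ends_concentration_nM(33, 20000):.1f} nM ends")
print(f"{ends_concentration_nM(100, 3000):.1f} nM ends")
```

Under these assumptions, 33 µg/ml of a 20 kb fragment gives roughly 5 nM ends and
100 µg/ml of a 3 kb fragment gives roughly 100 nM ends, in line with the range quoted
above.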

In vector construction, the vector fragment is commonly treated with bacterial or
calf intestinal alkaline phosphatase (BAP or CIAP) to remove the 5' phosphate and
prevent religation and reconstruction of the vector. BAP and CIAP digestion
conditions are well known in the art, and published protocols usually accompany the
commercially available BAP and CIAP enzymes. To recover the nucleic acid
fragments, the preparation is extracted with phenol-chloroform and ethanol precipitated
to remove the phosphatase and purify the DNA. Alternatively, religation of unwanted
vector fragments can be prevented by restriction enzyme digestion before or after
ligation, if appropriate restriction sites are available.

For portions of vectors or coding sequences that require sequence
modifications, a variety of site-specific primer-directed mutagenesis methods are
available. The polymerase chain reaction (PCR) can be used to perform site-specific
mutagenesis. In another technique now standard in the art, a synthetic oligonucleotide
encoding the desired mutation is used as a primer to direct synthesis of a
complementary nucleic acid sequence contained in a single-stranded vector, such as
pBSM13+ derivatives, that serves as a template for construction of the extension
product of the mutagenizing primer. The mutagenized DNA is transformed into a host
bacterium, and cultures of the transformed bacteria are plated and identified. The
identification of modified vectors may involve transfer of the DNA of selected
transformants to a nitrocellulose filter or other membrane and the "lifts" hybridized with
kinased synthetic mutagenic primer at a temperature that permits hybridization of an
exact match to the modified sequence but prevents hybridization with the original
unmutagenized strand. Transformants that contain DNA that hybridizes with the probe
are then cultured (the sequence of the DNA is generally confirmed by sequence
analysis) and serve as a reservoir of the modified DNA.

In the construction set forth below, correct ligations for plasmid construction
are confirmed by first transforming E. coli strain DG101 (ATCC 47043) or another
suitable host with the ligation mixture. Successful transformants are selected by
ampicillin, tetracycline or other antibiotic resistance or sensitivity or by using other
markers, depending on the mode of plasmid construction, as is understood in the art.
Plasmids from the transformants are then prepared according to the method of Clewell
et al., 1969, Proc. Natl. Acad. Sci. USA 62:1159, optionally following
chloramphenicol amplification (Clewell, 1972, J. Bacteriol. 110:667). Another method
for obtaining plasmid DNA is described as the "Base-Acid" extraction method at page
11 of the Bethesda Research Laboratories publication Focus, volume 5, number 2, and
very pure plasmid DNA can be obtained by replacing steps 12 through 17 of the
protocol with CsCl/ethidium bromide ultracentrifugation of the DNA. The isolated
DNA is analyzed by restriction enzyme digestion and/or sequenced by the dideoxy
method of Sanger et al., 1977, Proc. Natl. Acad. Sci. USA 74:5463, as further
described by Messing et al., 1981, Nuc. Acids Res. 9:309, or by the method of Maxam
et al., 1980, Methods in Enzymology 65:499.

The control sequences, expression vectors, and transformation methods are
dependent on the type of host cell used to express the gene. Generally, procaryotic,
yeast, insect, or mammalian cells are used as hosts. Procaryotic hosts are in general the
most efficient and convenient for the production of recombinant proteins and are
therefore preferred for the expression of Taf polymerase.

The procaryote most frequently used to express recombinant proteins is E. coli.
For cloning and sequencing, and for expression of constructions under control of most
bacterial promoters, E. coli K12 strain MM294, obtained from the E. coli Genetic
Stock Center under GCSC #6135, can be used as the host. For expression vectors
with the PLNRBS or PLT7RBS control sequence, E. coli K12 strain MC1000 lambda
lysogen, λN7N53cI857 SusP80, ATCC 39531, may be used. E. coli DG116, which
was deposited with the ATCC (ATCC 53606) on April 7, 1987, and E. coli KB2,
which was deposited with the ATCC (ATCC 53075) on March 29, 1985, are also
useful host cells. For M13 phage recombinants, E. coli strains susceptible to phage
infection, such as E. coli K12 strain DG98, are employed. The DG98 strain was
deposited with the ATCC (ATCC 39768) on July 13, 1984.

However, microbial strains other than E. coli can also be used, such as bacilli,
for example Bacillus subtilis, various species of Pseudomonas, and other bacterial
strains, for recombinant expression of Taf DNA polymerase. In such procaryotic
systems, plasmid vectors that contain replication sites and control sequences derived
from the host or a species compatible with the host are typically used.

For example, E. coli is typically transformed using derivatives of pBR322,
described by Bolivar et al., 1977, Gene 2:95. Plasmid pBR322 contains genes for
ampicillin and tetracycline resistance. These drug resistance markers can be either
retained or destroyed in constructing the desired vector and so help to detect the
presence of a desired recombinant. Commonly used procaryotic control sequences,
i.e., a promoter for transcription initiation, optionally with an operator, along with a
ribosome binding site sequence, include the β-lactamase (penicillinase) and lactose (lac)
promoter systems (Chang et al., 1977, Nature 198:1056), the tryptophan (trp)
promoter system (Goeddel et al., 1980, Nuc. Acids Res. 8:4057), and the lambda-
derived PL promoter (Shimatake et al., 1981, Nature 292:128) and N-gene ribosome
binding site (NRBS). A portable control system cassette is set forth in United States
Patent No. 4,711,845, issued December 8, 1987. This cassette comprises a PL
promoter operably linked to the NRBS in turn positioned upstream of a third DNA
sequence having at least one restriction site that permits cleavage within six bp 3' of the
NRBS sequence. Also useful is the phosphatase A (phoA) system described by Chang
et al. in European Patent Publication No. 196,864, published October 8, 1986.
However, any available promoter system compatible with procaryotes can be used to
construct a Taf expression vector of the invention.

In addition to bacteria, eucaryotic microbes, such as yeast, can also be used as
recombinant host cells. Laboratory strains of Saccharomyces cerevisiae, Baker's yeast,
are most often used, although a number of other strains are commonly available. While
vectors employing the two micron origin of replication are common (Broach, 1983,
Meth. Enz. 101:307), other plasmid vectors suitable for yeast expression are known
(see, for example, Stinchcomb et al., 1979, Nature 282:39; Tschempe et al., 1980,
Gene 10:157; and Clarke et al., 1983, Meth. Enz. 101:300). Control sequences for
yeast vectors include promoters for the synthesis of glycolytic enzymes (Hess et al.,
1968, J. Adv. Enzyme Reg. 7:149; Holland et al., 1978, Biotechnology 17:4900; and
Holland et al., 1981, J. Biol. Chem. 256:1385). Additional promoters known in the
art include the promoter for 3-phosphoglycerate kinase (Hitzeman et al., 1980, J. Biol.
Chem. 255:2073) and those for other glycolytic enzymes, such as glyceraldehyde 3-
phosphate dehydrogenase, hexokinase, pyruvate decarboxylase, phosphofructokinase,
glucose-6-phosphate isomerase, 3-phosphoglycerate mutase, pyruvate kinase,
triosephosphate isomerase, phosphoglucose isomerase, and glucokinase. Other
promoters that have the additional advantage of transcription controlled by growth
conditions are the promoter regions for alcohol dehydrogenase 2, isocytochrome C,
acid phosphatase, degradative enzymes associated with nitrogen metabolism, and
enzymes responsible for maltose and galactose utilization (Holland, supra).

Terminator sequences may also be used to enhance expression when placed at
the 3' end of the coding sequence. Such terminators are found in the 3' untranslated
region following the coding sequences in yeast-derived genes. Any vector containing a
yeast-compatible promoter, origin of replication, and other control sequences is suitable
for use in constructing yeast Taf expression vectors.

The Taf gene can also be expressed in eucaryotic host cell cultures derived from
multicellular organisms. See, for example, Tissue Culture, Academic Press, Cruz and
Patterson, editors (1973). Useful host cell lines include COS-7, COS-A2, CV-1,
murine cells such as murine myelomas NS1 and VERO, HeLa cells, and Chinese
hamster ovary (CHO) cells. Expression vectors for such cells ordinarily include
promoters and control sequences compatible with mammalian cells such as, for
example, the commonly used early and late promoters from Simian Virus 40 (SV 40)
(Fiers et al., 1978, Nature 273:113), or other viral promoters such as those derived
from polyoma, adenovirus 2, bovine papilloma virus (BPV), or avian sarcoma viruses,
or immunoglobulin promoters and heat shock promoters. A system for expressing
DNA in mammalian systems using a BPV vector system is disclosed in United States
Patent No. 4,419,446. A modification of this system is described in United States
Patent No. 4,601,978. General aspects of mammalian cell host system transformations
have been described by Axel, United States Patent No. 4,399,216. "Enhancer" regions
are also important in optimizing expression; these are, generally, sequences found
upstream of the promoter region. Origins of replication may be obtained, if needed,
from viral sources. However, integration into the chromosome is a common
mechanism for DNA replication in eucaryotes.

Plant cells can also be used as hosts, and control sequences compatible with
plant cells, such as the nopaline synthase promoter and polyadenylation signal
sequences (Depicker et al., 1982, J. Mol. Appl. Gen. 1:561) are available. Expression
systems employing insect cells utilizing the control systems provided by baculovirus
vectors have also been described (Miller et al., in Genetic Engineering (1986), Setlow
et al., eds., Plenum Publishing, Vol. 8, pp. 277-297). Insect cell-based expression
can be accomplished in Spodoptera frugiperda. These systems are also successful in
producing recombinant Taf polymerase.

Depending on the host cell used, transformation is done using standard
techniques appropriate to such cells. The calcium treatment employing calcium
chloride, as described by Cohen, 1972, Proc. Natl. Acad. Sci. USA 69:2110 is used
for procaryotes or other cells that contain substantial cell wall barriers. Infection with
Agrobacterium tumefaciens (Shaw et al., 1983, Gene 23:315) is used for certain plant
cells. For mammalian cells, the calcium phosphate precipitation method of Graham and
van der Eb, 1978, Virology 52:546 is preferred. Transformations into yeast are carried
out according to the method of Van Solingen et al., 1977, J. Bact. 130:946, and Hsiao
et al., 1979, Proc. Natl. Acad. Sci. USA 76:3829.

Once the Taf DNA polymerase has been expressed in a recombinant host cell,
purification of the protein may be desired. Although a variety of purification
procedures can be used to purify the recombinant thermostable polymerase of the
invention, fewer steps may be necessary to yield an enzyme preparation of equal purity.
Because E. coli host proteins are heat-sensitive, the recombinant thermostable Taf DNA
polymerase can be substantially enriched by heat inactivating the crude lysate. This
step is done in the presence of a sufficient amount of salt (typically 0.3 M ammonium
sulfate) to ensure dissociation of the Taf DNA polymerase from the host DNA and to
reduce ionic interactions of Taf DNA polymerase with other cell lysate proteins.

In addition, the presence of 0.3 M ammonium sulfate promotes hydrophobic
interaction with a phenyl sepharose column. Hydrophobic interaction chromatography
is a separation technique in which substances are separated on the basis of differing
strengths of hydrophobic interaction with an uncharged bed material containing
hydrophobic groups. Typically, the column is first equilibrated under conditions
favorable to hydrophobic binding, such as high ionic strength. A descending salt
gradient may then be used to elute the sample.

According to the invention, an aqueous mixture (containing either native or
recombinant Taf DNA polymerase) is loaded onto a column containing a relatively
strong hydrophobic gel such as phenyl sepharose (manufactured by Pharmacia) or
Phenyl TSK (manufactured by Toyo Soda). To promote hydrophobic interaction with
a phenyl sepharose column, a solvent is used which contains, for example, greater than
or equal to 0.3 M ammonium sulfate. The column and the sample are adjusted to 0.3 M
ammonium sulfate in 50 mM Tris (pH 7.5) and 5 mM EDTA ("TE") buffer that also
contains 0.5 mM DTT, and the sample is applied to the column. The column is washed
with the 0.3 M ammonium sulfate buffer. The enzyme may then be eluted with
solvents which attenuate hydrophobic interactions, such as decreasing salt gradients, or
increasing gradients or addition of ethylene or propylene glycol, or urea. For native
Taf DNA polymerase, a preferred embodiment involves washing the column with 2 M
urea in 20% ethylene glycol in TE-DTT wash.

For long-term stability, Taf DNA polymerase enzyme can be stored in a buffer
that contains one or more non-ionic polymeric detergents. Such detergents are
generally those that have a molecular weight in the range of approximately 100 to
250,000 daltons, preferably about 4,000 to 200,000 daltons and stabilize the enzyme at
a pH of from about 3.5 to about 9.5, preferably from [...]. Examples of such
detergents include those specified on pages 295-298 of McCutcheon's Emulsifiers &
Detergents, North American edition (1983), published by the McCutcheon Division of
MC Publishing Co., 175 Rock Road, Glen Rock, NJ (USA).

Preferably, the detergents are selected from the group comprising ethoxylated
fatty alcohol ethers and lauryl ethers, ethoxylated alkyl phenols, octylphenoxy
polyethoxy ethanol compounds, modified oxyethylated and/or oxypropylated straight-
chain alcohols, polyethylene glycol monooleate compounds, polysorbate compounds,
and phenolic fatty alcohol ethers. More particularly preferred are Tween 20, a
polyoxyethylated (20) sorbitan monolaurate from ICI Americas Inc., Wilmington,
DE, and Iconol NP-40, an ethoxylated alkyl phenol (nonyl) from BASF Wyandotte
Corp., Parsippany, NJ.

The thermostable enzyme of this invention may be used for any purpose in
which such enzyme activity is necessary or desired. In a particularly preferred
embodiment, the enzyme catalyzes the nucleic acid amplification reaction known as
PCR. This process for amplifying nucleic acid sequences is disclosed and claimed in
United States Patent No. 4,683,202, issued July 28, 1987, the disclosure of which is
incorporated herein by reference. The PCR nucleic acid amplification method involves
amplifying at least one specific nucleic acid sequence contained in a nucleic acid or a
mixture of nucleic acids and, in the most common embodiment, produces double-
stranded DNA.

For ease of discussion, the protocol set forth below assumes that the specific
sequence to be amplified is contained in a double-stranded nucleic acid. However, the
process is equally useful in amplifying single-stranded nucleic acid, such as mRNA,
although in the preferred embodiment the ultimate product is still double-stranded
DNA. In the amplification of a single-stranded nucleic acid, the first step involves the
synthesis of a complementary strand (one of the two amplification primers can be used
for this purpose), and the succeeding steps proceed as in the double-stranded
amplification process described below.

This amplification process comprises the steps of: 
(a) contacting each nucleic acid strand with four different nucleoside
triphosphates and one oligonucleotide primer for each strand of the specific sequence
being amplified, wherein each primer is selected to be substantially complementary to
the different strands of the specific sequence, such that the extension product
synthesized from one primer, when it is separated from its complement, can serve as a
template for synthesis of the extension product of the other primer, said contacting
being at a temperature which allows hybridization of each primer to a complementary
nucleic acid strand;

(b) contacting each nucleic acid strand, at the same time as or after step (a),
with a DNA polymerase from Taf which enables combination of the nucleoside
triphosphates to form primer extension products complementary to each strand of the
specific nucleic acid sequence;

(c) maintaining the mixture from step (b) at an effective temperature for an
effective time to promote the activity of the enzyme and to synthesize, for each different
sequence being amplified, an extension product of each primer which is complementary
to each nucleic acid strand template, but not so high as to separate each extension
product from the complementary strand template;

(d) heating the mixture from step (c) for an effective time and at an effective
temperature to separate the primer extension products from the templates on which they
were synthesized to produce single-stranded molecules but not so high as to denature
irreversibly the enzyme;

(e) cooling the mixture from step (d) for an effective time and to an effective 
temperature to promote hybridization of a primer to each of the single-stranded 
molecules produced in step (d); and 

(f) maintaining the mixture from step (e) at an effective temperature for an
effective time to promote the activity of the enzyme and to synthesize, for each different
sequence being amplified, an extension product of each primer which is complementary
to each nucleic acid template produced in step (d) but not so high as to separate each
extension product from the complementary strand template. The effective times and
temperatures in steps (e) and (f) may coincide, so that steps (e) and (f) can be carried
out simultaneously. Steps (d)-(f) are repeated until the desired level of amplification is
obtained.
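
The repeated heating, cooling, and extension steps amount to a three-temperature program run for a chosen number of cycles. The following Python sketch is illustrative only and is not part of the disclosed method; the names `Step` and `build_program`, and the temperatures, times, and cycle count used, are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in the program: a label, a temperature, and a duration."""
    name: str
    temp_c: float
    seconds: int


def build_program(cycles, denature, anneal, extend, combine_anneal_extend=False):
    """Return the flat list of holds for `cycles` repetitions of steps (d)-(f).

    When `combine_anneal_extend` is True, hybridization and extension are
    treated as one hold (the case where steps (e) and (f) coincide).
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    one_cycle = [denature, extend] if combine_anneal_extend else [denature, anneal, extend]
    return [step for _ in range(cycles) for step in one_cycle]


# Hypothetical settings, for illustration only.
denature = Step("denature (d)", 95.0, 30)
anneal = Step("hybridize (e)", 55.0, 30)
extend = Step("extend (f)", 70.0, 60)

program = build_program(25, denature, anneal, extend)
total_minutes = sum(step.seconds for step in program) / 60
```

With these assumed settings the program holds 75 steps and runs for 50 minutes of hold time, not counting the time spent ramping between temperatures.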

The amplification method is useful not only for producing large amounts of a
specific nucleic acid sequence of known sequence but also for producing nucleic acid
sequences which are known to exist but are not completely specified. One need know
only a sufficient number of bases at both ends of the sequence in sufficient detail so that
two oligonucleotide primers can be prepared which will hybridize to different strands of
the desired sequence at relative positions along the sequence such that an extension
product synthesized from one primer, when separated from the template (complement),
can serve as a template for extension of the other primer. The greater the knowledge
about the bases at both ends of the sequence, the greater can be the specificity of the
primers for the target nucleic acid sequence.

In any case, an initial copy of the sequence to be amplified must be available,
although the sequence need not be pure or a discrete molecule. In general, the
amplification process involves a chain reaction for producing at least one specific
nucleic acid sequence, called the "target" sequence, given that (a) the ends of the target
sequence are known in sufficient detail that oligonucleotides can be synthesized which
will hybridize to them, and (b) a small amount of the sequence is available to initiate the
chain reaction. The product accumulates exponentially relative to the number of
reaction steps involved. The product of the chain reaction is a discrete nucleic acid duplex
with termini corresponding to the ends of the specific primers employed.
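
The exponential accumulation can be made concrete with an idealized model in which each cycle multiplies the number of target copies by (1 + efficiency). The Python sketch below is an assumption-laden illustration, not a description of measured behavior: the function name `expected_copies` and the `efficiency` parameter are hypothetical, and real reactions fall short of perfect doubling.

```python
def expected_copies(initial_copies, cycles, efficiency=1.0):
    """Idealized copy number after `cycles` rounds of amplification.

    `efficiency` is the assumed fraction of templates copied per cycle:
    1.0 means perfect doubling, 0.0 means no amplification.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


# One starting copy, 20 perfect cycles: 2**20 copies.
ideal = expected_copies(1, 20)
# The same run at an assumed 80% per-cycle efficiency yields far fewer copies.
realistic = expected_copies(1, 20, efficiency=0.8)
```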

Any nucleic acid sequence, in purified or nonpurified form, can be utilized as
the starting nucleic acid(s), provided it contains or is suspected to contain the specific
nucleic acid sequence desired. The nucleic acid to be amplified can be obtained from
any source, for example, from plasmids such as pBR322, from cloned DNA or RNA,
from natural DNA or RNA from any source, including bacteria, yeast, viruses,
organelles, and higher organisms such as plants and animals, or from preparations of
nucleic acid made in vitro. DNA or RNA may be extracted from blood, tissue material
such as chorionic villi, or amniotic cells by a variety of techniques. See, e.g., Maniatis
et al., 1982, Molecular Cloning: A Laboratory Manual (Cold Spring Harbor
Laboratory, Cold Spring Harbor, New York) pp. 280-281. Thus, the process may
employ, for example, DNA or RNA, including messenger RNA, which DNA or RNA
may be single-stranded or double-stranded. In addition, a DNA-RNA hybrid which
contains one strand of each may be utilized. A mixture of any of these nucleic acids can
also be employed, as can nucleic acids produced from a previous amplification reaction
(using the same or different primers). The specific nucleic acid sequence to be
amplified may be only a fraction of a large molecule or can be present initially as a
discrete molecule, so that the specific sequence constitutes the entire nucleic acid.

The sequence to be amplified need not be present initially in a pure form; the
sequence can be a minor fraction of a complex mixture, such as a portion of the
β-globin gene contained in whole human DNA (as exemplified in Saiki et al., 1985,
Science 230:1530-1534) or a portion of a nucleic acid sequence due to a particular
microorganism, which organism might constitute only a very minor fraction of a
particular biological sample. The cells can be directly used in the amplification process
after suspension in hypotonic buffer and heat treatment at about 90°-100°C until cell
lysis and dispersion of intracellular components occur (generally 1 to 15 minutes).
After the heating step, the amplification reagents may be added directly to the lysed
cells. The starting nucleic acid sequence may contain more than one desired specific
nucleic acid sequence. The amplification process is useful not only for producing large
amounts of one specific nucleic acid sequence but also for amplifying simultaneously
more than one different specific nucleic acid sequence located on the same or different
nucleic acid molecules.

Primers play a key role in the PCR process. The word "primer" as used in
describing the amplification process can refer to more than one primer, particularly in
the case where there is some ambiguity in the information regarding the terminal
sequence(s) of the fragment to be amplified. For instance, in the case where a nucleic
acid sequence is inferred from protein sequence information, a collection of primers
containing sequences representing all possible codon variations based on degeneracy of
the genetic code will be used for each strand. At least one primer from this collection
will be sufficiently homologous with the end of the desired sequence to be amplified to
be useful for amplification.
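
A collection of primers covering every codon variation for a short peptide can be enumerated mechanically. The Python sketch below assumes the standard genetic code and a hypothetical three-residue peptide; the names `CODON_TABLE`, `codons_for`, and `degenerate_primers` are introduced here for illustration only. It lists coding-strand sequences; primers for the opposite strand would be the reverse complements of such sequences.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    codons = sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)
    if not codons:
        raise ValueError(f"unknown amino acid: {amino_acid!r}")
    return codons


def degenerate_primers(peptide):
    """Enumerate all coding-strand sequences that could encode `peptide`."""
    choices = [codons_for(aa) for aa in peptide.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical peptide Met-Trp-Lys: Met and Trp have one codon each, Lys has two.
primers = degenerate_primers("MWK")
```

The size of the collection is the product of the codon counts at each position, so peptides rich in Met and Trp give small collections while those rich in Leu, Ser, or Arg give large ones.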

In addition, more than one specific nucleic acid sequence can be amplified from
the first nucleic acid or mixture of nucleic acids, so long as the appropriate number of
different oligonucleotide primers are utilized. For example, if two different specific
nucleic acid sequences are to be produced, four primers are utilized. Two of the
primers are specific for one of the specific nucleic acid sequences and the other two
primers are specific for the second specific nucleic acid sequence. In this manner, each
of the two different specific sequences can be produced exponentially by the present
process. When allelic variants or different members of a multigene family are to be
amplified, however, one can often amplify several different sequences with a single set
of primers.

A sequence within a given sequence can be amplified after a given number of
amplifications to obtain greater specificity of the reaction by adding after at least one
cycle of amplification a set of primers that are complementary to internal sequences (that
are not on the ends) of the sequence to be amplified. Such primers may be added at any
stage and will provide a shorter amplified fragment. Alternatively, a longer fragment
can be prepared by using primers with non-complementary 5' ends but having some 3'
overlap with the 5' ends of the primers previously utilized in the amplification.

Primers also play a key role when the amplification process is used for in vitro
mutagenesis. The product of an amplification reaction where the primers employed are
not exactly complementary to the original template will contain the sequence of the
primer rather than the template, so introducing an in vitro mutation. Although the initial
cycles may be somewhat inefficient, due to the mismatch between the mutagenic primer
and the target, in further cycles the mutation will be amplified with an undiminished
efficiency because no further mispaired priming is required. The process of making an
altered DNA sequence as described above could be repeated on the altered DNA using
different primers to induce further sequence changes. In this way, a series of mutated
sequences can gradually be produced wherein each new addition to the series differs
from the last in a minor way, but from the original DNA source sequence in an
increasingly major way.

Because the primer can contain as part of its sequence a non-complementary
sequence, provided that a sufficient amount of the primer contains a sequence that is
complementary to the strand to be amplified, many other advantages can be realized.
For example, a nucleotide sequence that is not complementary to the template sequence
(such as, e.g., a promoter, linker, coding sequence, etc.) may be attached at the 5' end
of one or both of the primers and so appended to the product of the amplification
process. After the extension primer is added, sufficient cycles are run to achieve the
desired amount of new template containing the non-complementary nucleotide insert.
This allows production of large quantities of the combined fragments in a relatively
short period of time (e.g., two hours or less) using a simple technique.
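
The way a 5' tail becomes part of the amplified product can be sketched with simple string operations. In the Python example below, the target sequence, the tail sequences, and the helper names `revcomp`, `tailed_primers`, and `amplified_top_strand` are all hypothetical placeholders; the sketch models only the final sequence, not the kinetics of the early mismatched cycles.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(target, anneal_len, fwd_tail, rev_tail):
    """Build forward and reverse primers, each with a non-complementary 5' tail.

    The 3' portion of each primer matches one end of `target` (top strand,
    5' to 3'); the tail sits at the 5' end and does not pair with the template.
    """
    forward = fwd_tail + target[:anneal_len]
    reverse = rev_tail + revcomp(target[-anneal_len:])
    return forward, reverse


def amplified_top_strand(target, fwd_tail, rev_tail):
    """Top strand of the product once both tails have been copied in."""
    return fwd_tail + target + revcomp(rev_tail)


# Placeholder sequences, for illustration only.
target = "ATGCCGTTAGGCTTAACG"
fwd, rev = tailed_primers(target, 6, fwd_tail="GAATTC", rev_tail="GGTACCAA")
product = amplified_top_strand(target, "GAATTC", "GGTACCAA")
```

The product begins with the forward primer and ends with the reverse complement of the reverse primer, so both tails are carried into every later copy.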

Oligonucleotide primers can be prepared using any suitable method, such as,
for example, the phosphotriester and phosphodiester methods or automated
embodiments thereof. The phosphotriester method is described in Narang et al., 1979,
Meth. Enzymol. 68:90, and U.S. Patent No. 4,356,270. The phosphodiester method
is described in Brown et al., 1979, Meth. Enzymol. 68:109. In one such automated
embodiment, diethylphosphoramidites are used as starting materials and may be
synthesized as described by Beaucage et al., 1981, Tetrahedron Letters 22:1859-1862.
One method for synthesizing oligonucleotides on a modified solid support is described
in United States Patent No. 4,458,066. One can also use a primer that has been
isolated from a biological source (such as a restriction endonuclease digest).

To produce a specific nucleic acid sequence using PCR, a nucleic acid
containing that sequence is used as a template. The first step involves contacting each
nucleic acid strand with four different nucleoside triphosphates and one oligonucleotide
primer for each strand of each specific nucleic acid sequence being amplified or
detected. If the nucleic acids to be amplified or detected are DNA, then the nucleoside
triphosphates are usually dATP, dCTP, dGTP, and dTTP, although various nucleotide
derivatives can also be used in the process. The concentration of nucleoside
triphosphates can vary widely. Typically the concentration is 50-200 µM of each
dNTP in the buffer for amplification, and MgCl2 is present in the buffer in an amount
of 1 to 3 mM to activate the polymerase and increase the specificity of the reaction.
However, dNTP concentrations of 1-20 µM may be preferred for some applications,
such as DNA sequencing or labeling PCR products at high specific activity.
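
Reaching a chosen final dNTP concentration from a concentrated stock is a simple dilution calculation (C1 x V1 = C2 x V2). The Python sketch below is illustrative; the function name `stock_volume_ul`, the 10 mM stock concentration, and the 100 µL reaction volume are assumptions and do not come from the disclosure.

```python
def stock_volume_ul(stock_um, final_um, reaction_ul):
    """Volume of stock (in µL) giving `final_um` in a reaction of `reaction_ul`.

    Concentrations are in micromolar; uses C1 * V1 = C2 * V2.
    """
    if stock_um <= 0:
        raise ValueError("stock concentration must be positive")
    if not 0 <= final_um <= stock_um:
        raise ValueError("final concentration must lie between 0 and the stock")
    return final_um * reaction_ul / stock_um


# Assumed 10 mM (10,000 µM) stock of each dNTP and a 100 µL reaction.
high_end = stock_volume_ul(10_000, 200, 100)  # µL per dNTP for 200 µM
low_end = stock_volume_ul(10_000, 50, 100)    # µL per dNTP for 50 µM
```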

The nucleic acid strands of the target nucleic acid serve as templates for the
synthesis of additional nucleic acid strands, which are extension products of the
primers. This synthesis can be performed using any suitable method, but generally
occurs in a buffered aqueous solution, preferably at a pH of 7 to 9, most preferably
about 8. To facilitate synthesis, a molar excess of the two oligonucleotide primers is
added to the buffer containing the template strands. As a practical matter, the amount of
primer added will generally be in molar excess over the amount of complementary
strand (template) when the sequence to be amplified is contained in a mixture of
complicated long-chain nucleic acid strands. A large molar excess is preferred to
improve the efficiency of the process. Accordingly, primer:template ratios of about
1000:1 are generally employed for cloned DNA templates, and primer:template ratios
of about 10^8:1 are generally employed for amplification from complex genomic
samples.
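
The molar excess of primer over template follows directly from the amount of primer and the number of template copies. The Python sketch below is a hedged illustration: the function name `primer_template_ratio` and the example quantities are assumptions, and only the Avogadro constant is a fixed value.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def primer_template_ratio(primer_pmol, template_copies):
    """Molecules of one primer per copy of template in a reaction."""
    if template_copies <= 0:
        raise ValueError("template_copies must be positive")
    primer_molecules = primer_pmol * 1e-12 * AVOGADRO
    return primer_molecules / template_copies


# Hypothetical reaction: 100 pmol of primer and about 6 x 10^5 template copies.
ratio = primer_template_ratio(100, 6.02214076e5)
```

Under these assumed quantities the ratio comes to about 10^8 primer molecules per template copy; supplying a thousand-fold more template copies with the same primer amount lowers it to about 10^5.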

The mixture of template, primers, and nucleoside triphosphates is then treated
according to whether the nucleic acids being amplified or detected are double- or single-
stranded. If the nucleic acids are single-stranded, then no denaturation step need be
employed, and the reaction mixture is held at a temperature which promotes
hybridization of the primer to its complementary target (template) sequence. Such
temperature is generally from about 35°C to 65°C or more, preferably about 37°-60°C,
for an effective time, generally from a few seconds to five minutes, preferably from 30
seconds to one minute. A hybridization temperature of 35°-80°C may be used for Taf
DNA polymerase, and 15-mer or longer primers are used to increase the specificity of
primer hybridization. Shorter primers require lower hybridization temperatures or
agents which stabilize double-stranded DNA.
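
The dependence of hybridization temperature on primer length can be illustrated with a common rule of thumb that is not taken from this disclosure: the "Wallace rule", which estimates the melting temperature of a short oligonucleotide as 2°C per A or T plus 4°C per G or C. The Python sketch below applies that rule to two hypothetical primers; the estimate is rough, and the function name `wallace_tm` is an assumption.

```python
def wallace_tm(primer):
    """Rough melting temperature (°C) by the 2(A+T) + 4(G+C) rule of thumb."""
    primer = primer.upper()
    if set(primer) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, and T")
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical primers: a 10-mer and a 15-mer that extends it.
tm_10mer = wallace_tm("ATGCATGCAT")
tm_15mer = wallace_tm("ATGCATGCATGCATG")
```

By this estimate the 15-mer melts roughly 16°C higher than the 10-mer, consistent with shorter primers needing lower hybridization temperatures.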

The complement to the original single-stranded nucleic acids can be synthesized
by adding Taf DNA polymerase in the presence of the appropriate buffer, dNTPs, and
one or more oligonucleotide primers. If an appropriate single primer is added, the
primer extension product will be complementary to the single-stranded nucleic acid and
will be hybridized with the nucleic acid strand in a duplex of strands of equal or
unequal length (depending where the primer hybridizes on the template), which may
then be separated into single strands as described above to produce two single,
separated, complementary strands. Alternatively, two or more appropriate primers (one
of which will prime synthesis using the extension product of the other primer as a
template) may be added to the single-stranded nucleic acid and the reaction carried out.

If the nucleic acid contains two strands, as in the case of amplification of a
double-stranded target or second-cycle amplification of a single-stranded target, the
strands of nucleic acid must be separated before the primers are hybridized. This strand
separation can be accomplished by any suitable denaturing method, including physical,
chemical or enzymatic means. One preferred physical method of separating the strands
of the nucleic acid involves heating the nucleic acid until complete (>99%) denaturation
occurs. Typical heat denaturation involves temperatures ranging from about 90° to
105°C for times generally ranging from about a few seconds to 4 minutes, depending
on the composition and size of the nucleic acid. Preferably, the effective denaturing
temperature is 90°-100°C for a few seconds to 1 minute. Strand separation may also be
induced by an enzyme from the class of enzymes known as helicases or the enzyme
RecA, which has helicase activity and in the presence of ATP is known to denature
DNA. The reaction conditions suitable for separating the strands of nucleic acids with
helicases are described by Kuhn Hoffmann-Berling, 1978, CSH-Quantitative Biology
43:63, and techniques for using RecA are reviewed in Radding, 1982, Ann. Rev.
Genetics 16:405-437. The denaturation produces two separated complementary strands
of equal or unequal length.

If the double-stranded nucleic acid is denatured by heat, the reaction mixture is
allowed to cool to a temperature which promotes hybridization of each primer to the
complementary target (template) sequence. This temperature is usually from about
35°C to 65°C or more, depending on reagents, preferably 37°-60°C. The hybridization
temperature is maintained for an effective time, generally 30 seconds to 5 minutes, and
preferably 1-3 minutes. In practical terms, the temperature is simply lowered from
about 95°C to as low as 37°C, and hybridization occurs at a temperature within this
range.

Whether the nucleic acid is single- or double-stranded, the DNA polymerase
from Taf may be added at the denaturation step or when the temperature is being
reduced to or is in the range for promoting hybridization. Although the thermostability
of Taf polymerase allows one to add Taf polymerase to the reaction mixture at any time,
one can substantially inhibit non-specific amplification by adding the polymerase to the
reaction mixture at a point in time when the mixture will not be cooled below the
stringent hybridization temperature. After hybridization, the reaction mixture is then
heated to or maintained at a temperature at which the activity of the enzyme is promoted
or optimized, i.e., a temperature sufficient to increase the activity of the enzyme in
facilitating synthesis of the primer extension products from the hybridized primer and
template. The temperature must actually be sufficient to synthesize an extension
product of each primer which is complementary to each nucleic acid template, but must
not be so high as to denature each extension product from its complementary template
(i.e., the temperature is generally less than about 80°-90°C).

Depending on the nucleic acid(s) employed, the typical temperature effective for
this synthesis reaction generally ranges from about 40° to 80°C, preferably 50°-75°C.
The temperature more preferably ranges from about 65°-75°C for Taf DNA polymerase.
The period of time required for this synthesis may range from several seconds to 40
minutes or more, depending mainly on the temperature, the length of the nucleic acid,
the enzyme, and the complexity of the nucleic acid mixture. The extension time is
usually about 30 seconds to three minutes. If the nucleic acid is longer, a longer time
period is generally required for complementary strand synthesis. The newly
synthesized strand and the complement nucleic acid strand form a double-stranded
molecule which is used in the succeeding steps of the amplification process.

In the next step, the strands of the double-stranded molecule are separated by
heat denaturation at a temperature and for a time effective to denature the molecule, but
not at a temperature and for a period so long that the thermostable enzyme is completely
and irreversibly denatured or inactivated. After this denaturation of template, the
temperature is decreased to a level which promotes hybridization of the primer to the
complementary single-stranded molecule (template) produced from the previous step,
as described above.

After this hybridization step, or concurrently with the hybridization step, the
temperature is adjusted to a temperature that is effective to promote the activity of the
thermostable enzyme to enable synthesis of a primer extension product using as a
template both the newly synthesized and the original strands. The temperature again
must not be so high as to separate (denature) the extension product from its template, as
described above. Hybridization may occur during this step, so that the previous step of
cooling after denaturation is not required. In such a case, using simultaneous steps, the
preferred temperature range is 50°-70°C.

The heating and cooling steps involved in one cycle of strand separation,
hybridization, and extension product synthesis can be repeated as often as needed to
produce the desired quantity of the specific nucleic acid sequence. The only limitation
is the amount of the primers, thermostable enzyme, and nucleoside triphosphates
present. Usually, from 15 to 30 cycles are completed. For diagnostic detection of
amplified DNA, the number of cycles will depend on the nature of the sample and the
sensitivity of the detection process used after amplification. If the sample is a complex
mixture of nucleic acids, more cycles will usually be required to amplify the signal
sufficiently for detection. For general amplification and detection, the process is
repeated about 15 times. When amplification is used to generate sequences to be
detected with labeled sequence-specific probes and when human genomic DNA is the
target of amplification, the process is usually repeated 15 to 30 times to amplify the
sequence sufficiently that a clearly detectable signal is produced, i.e., so that
background noise does not interfere with detection.
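
The broad temperature ranges and the usual cycle count given in the preceding paragraphs can be collected into a simple sanity check for a proposed cycling program. The Python sketch below is illustrative only; the names `RANGES_C`, `USUAL_CYCLES`, and `check_program` are assumptions, and the limits encode the general ranges stated above (about 90° to 105°C for denaturation, 35°-80°C for hybridization with Taf DNA polymerase, about 40° to 80°C for extension, and 15 to 30 cycles), not optimized conditions.

```python
# General ranges taken from the text; keys are hypothetical step labels.
RANGES_C = {
    "denature": (90.0, 105.0),
    "hybridize": (35.0, 80.0),
    "extend": (40.0, 80.0),
}
USUAL_CYCLES = (15, 30)


def check_program(temps_c, cycles):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for step, (low, high) in RANGES_C.items():
        temp = temps_c[step]
        if not low <= temp <= high:
            problems.append(f"{step}: {temp} C is outside {low}-{high} C")
    low_n, high_n = USUAL_CYCLES
    if not low_n <= cycles <= high_n:
        problems.append(f"cycles: {cycles} is outside the usual {low_n}-{high_n}")
    return problems


ok = check_program({"denature": 95, "hybridize": 55, "extend": 72}, 25)
flagged = check_program({"denature": 85, "hybridize": 55, "extend": 72}, 40)
```

A cycle count outside 15 to 30 is flagged only as unusual; the text notes that complex samples may need more cycles.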

No additional nucleotides, primers, or thermostable enzyme need be added after
the initial addition, provided that no key reagent has been exhausted and that the
enzyme has not become denatured or irreversibly inactivated, in which case additional
polymerase or other reagent would have to be added for the reaction to continue. After
the appropriate number of cycles has been completed to produce the desired amount of
the specific nucleic acid sequence, the reaction may be halted in the usual manner, e.g.,
by inactivating the enzyme by adding EDTA, phenol, SDS, or CHCl3 or by separating
the components of the reaction.

The amplification process may be conducted continuously. In one embodiment
of an automated process, the reaction mixture may be temperature cycled such that the
temperature is programmed to be controlled at a certain level for a certain time. One
such instrument for this purpose is the automated machine for handling the
amplification reaction developed and marketed by Perkin-Elmer Cetus Instruments.
Detailed instructions for carrying out PCR with the instrument are available upon
purchase of the instrument.

Taf DNA polymerase is very useful in the diverse processes in which
amplification of a nucleic acid sequence by the polymerase chain reaction is useful. The
amplification method may be utilized to clone a particular nucleic acid sequence for
insertion into a suitable expression vector, as described in United States Patent No.
4,800,159. The vector may be used to transform an appropriate host organism to
produce the gene product of the sequence by standard methods of recombinant DNA
technology. Such cloning may involve direct ligation into a vector using blunt-end
ligation, or use of restriction enzymes to cleave at sites contained within the primers or
amplified target sequences. Other processes suitable for Taf polymerase include those
described in United States Patent Nos. 4,683,194; 4,683,195; and 4,683,202 and
European Patent Publication Nos. 229,701; 237,362; and 258,017; these patents and
publications are incorporated herein by reference. In addition, the present enzyme is
useful in asymmetric PCR (see Gyllensten and Erlich, 1988, Proc. Natl. Acad. Sci.
USA 85:7652-7656, incorporated herein by reference); inverse PCR (Ochman et al.,
1988, Genetics 120:621, incorporated herein by reference); and for DNA sequencing
(see Innis et al., 1988, Proc. Natl. Acad. Sci. USA 85:9436-9440, and McConlogue et
al., 1988, Nuc. Acids Res. 16(20):9869). Taf polymerase is also believed to have
reverse transcriptase activity (see PCT publication WO 91/09944, which is
incorporated herein by reference), and 5'→3' exonuclease activity (also known as
structure dependent single strand endonuclease (SDSSE) activity).

The reverse transcriptase activity of the Taf DNA polymerase permits this
enzyme to be used in methods for transcribing and amplifying RNA. The improvement
of such methods resides in the use of a single enzyme, whereas previous methods have
required more than one enzyme.

The improved methods comprise the steps of: (a) combining an RNA template
with a suitable primer under conditions whereby the primer will anneal to the
corresponding RNA template; and (b) reverse transcribing the RNA template by
incubating the annealed primer-RNA template mixture with Taf DNA polymerase under
conditions sufficient for the DNA polymerase to catalyze the polymerization of
deoxyribonucleotide triphosphates to form a DNA sequence complementary to the
sequence of the RNA template.

In another aspect of the above method, the primer which anneals to the RNA
template may also be suitable for use in a PCR amplification. In PCR, a second primer
which is complementary to the reverse transcribed cDNA strand provides a site for
initiation of synthesis of an extension product. As already discussed above, the Taf
DNA polymerase is able to catalyze this extension reaction on the cDNA template.

In the amplification of an RNA molecule by Taf DNA polymerase, the first
extension reaction is reverse transcription, and a DNA strand is produced as an
RNA/cDNA hybrid molecule. The second extension reaction, using the DNA strand as
a template, produces a double-stranded DNA molecule. Thus, synthesis of a
complementary DNA strand from an RNA template with Taf DNA polymerase provides
the starting material for amplification by PCR.
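
The two extension reactions can be followed at the sequence level with a short sketch. The Python example below is illustrative only: the RNA sequence is a hypothetical placeholder, the function names `first_strand_cdna` and `second_strand` are assumptions, and full-length extension is assumed in both reactions.

```python
_RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")  # RNA base -> pairing DNA base
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(rna):
    """cDNA (5' to 3') produced by reverse transcribing `rna` (5' to 3')."""
    rna = rna.upper()
    if set(rna) - set("ACGU"):
        raise ValueError("RNA must contain only A, C, G, and U")
    return rna.translate(_RNA_TO_CDNA)[::-1]


def second_strand(cdna):
    """DNA strand (5' to 3') synthesized using the first-strand cDNA as template."""
    return cdna.upper().translate(_DNA_COMPLEMENT)[::-1]


rna = "AUGGCC"                 # hypothetical template
cdna = first_strand_cdna(rna)  # first extension: reverse transcription
sense = second_strand(cdna)    # second extension: DNA-templated synthesis
```

The second strand carries the same sequence as the original RNA with T in place of U, so the resulting duplex is an ordinary double-stranded template for later cycles.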

When Taf DNA polymerase is used for reverse transcription from an RNA
template, buffers which contain Mn2+ may provide improved stimulation of Taf reverse
transcriptase activity compared to Mg2+-containing reverse transcription buffers.
Consequently, increased cDNA yields may also result from these methods.

As stated above, the product of RNA reverse transcription by Taf DNA
polymerase is an RNA/cDNA hybrid molecule. The RNA can be removed or separated
from the cDNA by heat denaturation or any number of other known methods including
alkali, heat or enzyme treatment. The remaining cDNA strand then serves as a template
for polymerization of a complementary strand, thereby providing a means for obtaining
a double-stranded cDNA molecule suitable for amplification or other manipulation. The
second strand synthesis requires a sequence specific primer and Taf DNA polymerase.

Following the synthesis of the second cDNA strand, the resultant double-stranded
cDNA molecule can serve a number of purposes including DNA sequencing,
amplification by PCR or detection of a specific nucleic acid sequence. Specific primers
useful for amplification of a segment of the cDNA can be added subsequent to the
reverse transcription. Also, it may be desirable to use a first set of primers to
synthesize a specific cDNA molecule and a second nested set of primers to amplify a
desired cDNA segment. All of these reactions are catalyzed by Taf DNA polymerase.

Taf DNA polymerase may also be used to simplify and improve methods for
detection of RNA target molecules in a sample. In these methods, Taf DNA
polymerase catalyzes: (a) reverse transcription; (b) second strand cDNA synthesis;
and, if desired, (c) amplification by PCR. The use of Taf DNA polymerase in the
described methods eliminates the previous requirement of two sets of incubation
conditions which were necessary due to the use of different enzymes for each step.
The use of Taf DNA polymerase provides RNA reverse transcription and amplification
of the resulting complementary DNA with enhanced specificity and with fewer steps
than previous RNA cloning and diagnostic methods. These methods are adaptable for
use in laboratory or clinical analysis, and kits for making such analysis simple to
perform are an important aspect of the present invention.

The RNA which is reverse transcribed and amplified in the above methods can
be derived from a number of sources. The RNA template may be contained within a
nucleic acid preparation from an organism such as a viral or bacterial nucleic acid
preparation. The preparation may contain cell debris and other components, purified
total RNA or purified mRNA. The RNA template may also be a population of
heterogeneous RNA molecules in a sample. Furthermore, the target RNA may be
contained in a biological sample, and the sample may be a heterogeneous sample in
which RNA is but a small portion thereof. Examples of such biological samples
include blood samples and biopsied tissue samples.

Although the primers used in the reverse transcription step of the above
methods are generally completely complementary to the RNA template, they need not
be. As in PCR, not every nucleotide of the primer must be complementary to the
template for reverse transcription to occur. For example, a non-complementary
nucleotide sequence may be present at the 5' end of the primer with the remainder of the
primer sequence being complementary to the RNA. Alternatively, non-complementary
bases can be interspersed into the primer, provided that the primer sequence has
sufficient complementarity with the RNA template for hybridization to occur and allow
synthesis of a complementary DNA strand.

The structure dependent single stranded endonuclease (SDSSE) activity of Taf
DNA polymerase I may limit the amount of product produced by PCR, thus creating a
plateau phenomenon in the normally exponential accumulation of product. The SDSSE
activity may also limit the size of the PCR product produced and the ability to generate
PCR product from GC-rich target template. However, SDSSE activity can also be
helpful; see PCT Publication No. ______, based on PCT Application No. 91/05591,
filed August 6, 1991, and incorporated herein by reference. SDSSE activity relates to
the hydrolysis of phosphodiester bonds. SDSSE activity generally excises 5' terminal
regions of double-stranded DNA, thereby releasing 5'-mono- and oligonucleotides.
The preferred substrate for the SDSSE activity is displaced single-stranded DNA, with
hydrolysis of the phosphodiester bond which occurs between the displaced single-
stranded DNA and the double-stranded DNA. The cleavage site is a phosphodiester
bond in the double-stranded region.

Site-directed mutagenesis or deletion mutagenesis may be utilized to eliminate
the SDSSE activity of a polymerase having such activity. For example, a site-directed
mutation of G to A in the second position of the codon for Gly at residue 46 in the Taq
DNA polymerase coding sequence has been found to result in a greater than 1,000-fold
reduction of SDSSE activity in the protein encoded by the sequence with no
apparent change in polymerase activity, processivity or extension rate. This site-
directed mutation of the Taq DNA polymerase nucleotide sequence results in an amino
acid change of Gly (46) to Asp. Glycine 46 is conserved in Thermosipho africanus
DNA polymerase, but is present at codon 37, and the same Gly to Asp mutation would
have a similar effect on Taf SDSSE activity.
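
The codon arithmetic behind this mutation can be checked with a short sketch. The text does not give the actual Gly codon, so the snippet below simply tries all four glycine codons; the table and the function name are illustrative and not part of the specification. Under the standard genetic code, a G to A change at the second position gives Asp only when the third base is T or C, which suggests the codon in question is GGT or GGC.

```python
# Partial codon table: only the codons needed for this illustration
# (standard genetic code, DNA alphabet).
CODON_TO_AA = {
    "GGT": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GAT": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu",
}


def mutate_second_position(codon: str, new_base: str) -> str:
    """Return the codon with its second nucleotide replaced by new_base."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three nucleotides")
    return codon[0] + new_base + codon[2]


# Apply the G -> A change at position 2 to every glycine codon.
for gly_codon in ("GGT", "GGC", "GGA", "GGG"):
    mutant = mutate_second_position(gly_codon, "A")
    print(gly_codon, CODON_TO_AA[gly_codon], "->", mutant, CODON_TO_AA[mutant])
```

Only the GGT and GGC cases yield Asp; GGA and GGG would yield Glu instead.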

Gly 46 is found in a conserved AVYGF sequence domain in Taq DNA
polymerase; the sequence AVYGL contains the Gly (37) of Taf DNA polymerase.
Changing the glycine to aspartic acid within this conserved sequence domain will
reduce or eliminate the SDSSE activity. In addition, a deletion of all amino terminal
amino acids up to and including the glycine in the AVYGF/L domain will also reduce or
eliminate the SDSSE activity of any thermostable DNA polymerase having this
sequence domain, including the DNA polymerase of Taf.

One property found in the Taf DNA polymerase, but lacking in native Taq DNA
polymerase and native Tth DNA polymerase, is 3'→5' exonuclease activity. This
3'→5' exonuclease activity is generally considered desirable in certain applications,
because misincorporated or unmatched bases of the synthesized nucleic acid sequence
are eliminated by this activity. Therefore, the fidelity of PCR utilizing a polymerase
with 3'→5' exonuclease activity (e.g. Taf DNA polymerase) may be increased. The
3'→5' exonuclease activity found in Taf DNA polymerase also decreases the
probability of the formation of primer/dimer complexes in PCR. The 3'→5'
exonuclease activity in effect prevents any extra dNTPs from attaching to the 3' end of
the primer in a non-template dependent fashion by removing any nucleotide that is
attached in a non-template dependent fashion. The 3'→5' exonuclease activity can
eliminate single-stranded DNAs, such as primers or single-stranded template. In
essence, every 3'-nucleotide of a single-stranded primer or template is treated by the
enzyme as unmatched and is therefore degraded. To avoid primer degradation in PCR,
one can add phosphorothioate to the 3' ends of the primers. Phosphorothioate
modified nucleotides are more resistant to removal by 3'→5' exonucleases.

"Domain shuffling" or construction of "thermostable chimeric DNA
polymerases" may be used to provide thermostable DNA polymerases containing novel
properties. For example, substitution of the Taf DNA polymerase coding sequence
comprising the 3'→5' exonuclease domain for the Thermus aquaticus DNA
polymerase I codons 289-422 would yield a novel thermostable DNA polymerase
containing the 5'→3' exonuclease domain of Taq DNA polymerase (1-289), the 3'→5'
exonuclease domain of Taf DNA polymerase, and the DNA polymerase domain of Taq
DNA polymerase (423-832). Alternatively, the 5'→3' exonuclease domain and the
3'→5' exonuclease domain of Taf DNA polymerase may be fused to the DNA
polymerase (dNTP binding and primer/template binding domains) portions of Taq
DNA polymerase (ca. codons 423-832). The donors and recipients need not be limited
to Taq and Taf DNA polymerases. Tth DNA polymerase provides analogous domains
as Taq DNA polymerase. In addition, the enhanced/preferred reverse transcriptase
properties of Tth DNA polymerase can be further enhanced by the addition of a 3'→5'
exonuclease domain as illustrated above.

While any of a variety of means may be used to generate chimeric DNA
polymerase coding sequences (possessing novel properties), a preferred method
employs "overlap" PCR. In this method, the intended junction sequence is designed
into the PCR primers (at their 5'-ends). Following the initial amplification of the
individual domains, the various products are diluted (ca. 100 to 1000-fold) and
combined, denatured, annealed, extended, and then the final forward and reverse
primers are added for an otherwise standard PCR.
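
The way the junction sequence is carried on the 5'-ends of the inner primers can be sketched as follows. This is a minimal illustration, not the procedure used in the examples: the two domain sequences are made up, and the tail and annealing lengths (12 and 18 nucleotides) are arbitrary assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_primers(upstream: str, downstream: str, tail: int = 12, anneal: int = 18):
    """Design the two inner primers that join `upstream` to `downstream`.

    Each primer anneals to its own domain through its 3' portion and carries,
    at its 5' end, a tail copied from the neighbouring domain, so that the two
    first-round products share an overlap spanning the intended junction.
    """
    # Reverse primer for the upstream domain: its 5' tail is the reverse
    # complement of the first `tail` bases of the downstream domain.
    upstream_reverse = reverse_complement(upstream[-anneal:] + downstream[:tail])
    # Forward primer for the downstream domain: its 5' tail is the last
    # `tail` bases of the upstream domain.
    downstream_forward = upstream[-tail:] + downstream[:anneal]
    return upstream_reverse, downstream_forward


# Made-up domain sequences, for illustration only.
upstream = "ATGGCTAGCTAGGATCCGATCGATCGTACGATCG"
downstream = "GGTACCGAGCTCGAATTCACTGGCCGTCGTTTTAC"

up_rev, down_fwd = junction_primers(upstream, downstream)
print("upstream reverse primer:  5'-" + up_rev)
print("downstream forward primer: 5'-" + down_fwd)
```

With these primers the first-round product of the upstream domain ends in the first 12 bases of the downstream domain, and the downstream product starts with the last 12 bases of the upstream domain, so the two products can anneal to each other across a 24-base overlap when combined.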

Thus, the sequence that codes for the exonuclease activity of Taf DNA
polymerase can be removed from Taf DNA polymerase or added to other polymerases
which lack this activity by recombinant DNA methodology. One can even replace, in a
non-thermostable DNA polymerase, the 3'→5' exonuclease activity domain with the
thermostable 3'→5' exonuclease domain of Taf polymerase. Likewise, the 3'→5'
exonuclease activity domain of a non-thermostable DNA polymerase can be used to
replace the 3'→5' exonuclease domain of Taf polymerase (or any other thermostable
polymerase) to create a useful polymerase of the invention. Those of skill in the art
recognize that the above chimeric polymerases are most easily constructed by
recombinant DNA techniques. Similar chimeric polymerases can be constructed by
deleting or by moving the 5'→3' exonuclease domain of one DNA polymerase to
another.

The following examples are offered by way of illustration only and are by no
means intended to limit the scope of the claimed invention. In these examples, all
percentages are by weight if for solids and by volume if for liquids, unless otherwise
noted, and all temperatures are given in degrees Celsius.

Example 1 

Purification of Thermosipho africanus (Taf) DNA Polymerase I

This example describes the isolation of Taf DNA polymerase I from Taf.

Taf cells are grown by the method of Huber et al., supra. The culture of the Taf
cells is harvested by centrifugation after cultivation, in late log phase, at a cell density of
0.2 g to 0.3 g wet weight/l. Twenty grams of cells are resuspended in 80 ml of a buffer
consisting of 50 mM Tris-HCl, pH 7.5, 0.1 mM EDTA. The cells are lysed and the
lysate is centrifuged for two hours at 35,000 rpm in a Beckman TI 45 rotor at 4°C. The
supernatant is collected (fraction A) and the protein fraction precipitating between 45
and 75% saturation of ammonium sulfate is collected, dissolved in a buffer consisting
of 0.2 M potassium phosphate buffer, pH 6.5, 10 mM 2-mercaptoethanol, and 5%
glycerol, and finally dialyzed against the same buffer to yield fraction B.

Fraction B is applied to a 2.2 x 30 cm column of DEAE-cellulose, equilibrated
with the above described buffer. The column is then washed with the same buffer and
the fractions containing protein (determined by absorbance at 280 nm) are collected.
The combined protein fraction is dialyzed against a second buffer, containing 0.01 M
potassium phosphate buffer, pH 7.5, 10 mM 2-mercaptoethanol, and 5% glycerol, to
yield fraction C.

Fraction C is applied to a 2.6 x 21 cm column of hydroxyapatite, equilibrated
with the second buffer. The column is then washed and the enzyme is eluted with a
linear gradient of 0.01-0.5 M potassium phosphate buffer, pH 7.5, containing 10 mM
2-mercaptoethanol and 5% glycerol. Fractions containing DNA polymerase activity are
combined, concentrated four-fold using an Amicon stirred cell and YM10 membrane,
and dialyzed against the second buffer to yield fraction D.

Fraction D is applied to a 1.6 x 28 cm column of DEAE-cellulose, equilibrated
with the second buffer. The column is washed and the polymerase is eluted with a
linear gradient of 0.01-0.5 M potassium phosphate in the second buffer. The fractions
are assayed for contaminating endonuclease(s) and non-specific exonuclease(s) by
electrophoretically detecting the change in molecular weight of phage λ DNA or
supercoiled plasmid DNA after incubation with an excess of DNA polymerase (for
endonuclease) and after treatment of restriction enzyme cleaved DNA with the DNA
polymerase fractions (for exonuclease). Only those DNA polymerase fractions having
minimal non-specific nuclease contamination are pooled. To the pool is added
autoclaved gelatin in an amount of 250 µg/ml, and dialysis is conducted against the
second buffer to yield Fraction E.

Fraction E is applied to a phosphocellulose column and eluted with a 100 ml
gradient (0.01-0.8 M KCl gradient in 20 mM potassium phosphate buffer pH 7.5).
The fractions are assayed for contaminating endo/exonuclease(s) as described above as
well as for polymerase activity (by the method of Kaledin et al.) and then pooled. The
pooled fractions are dialyzed against the second buffer, and then concentrated by
dialysis against 50% glycerol and the second buffer to yield the desired ~100 kilodalton
polymerase.

Example 2 

Degenerate PCR Priming

Table 1 provides a list of primers used in Examples 2 and 3 along with the 
sequence identification number for each. 

Throughout the examples, A is Adenine; C is Cytosine; G is Guanine; T is
Thymine; Y is C + T (pYrimidine); S is G + C (Strong interaction; three hydrogen
bonds); W is A + T (Weak interaction; two hydrogen bonds); N is A + C + G + T
(aNy); and R is G + A (puRine).
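
As a rough illustration of how these ambiguity codes translate into pool degeneracy, the sketch below counts and enumerates the individual sequences represented by one degenerate primer. The lookup table uses the standard IUPAC codes, including a few (K, M, B, D, H, V) that are not defined in the paragraph above; the function names are illustrative only.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list:
    """Enumerate every unambiguous sequence in the degenerate pool."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]


# DG152 as listed in Table 1: R (2) x N (4) x Y (2) alternatives.
dg152 = "CGAGATCTGARGCNGAYGATGT"
print(degeneracy(dg152))
print(expand(dg152)[:3])
```

For DG152 this gives a 16-fold pool, and for DG154 (CGAGATCTACNGCNACWGG) a 32-fold pool, in line with the degeneracies quoted for these primers in Example 3.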

Table 1
Primer Sequences

Primer   SEQ ID NO:   Sequence
DG144    3            5'-CGGAATTCCNGGYARRTTATC
DG145    4            5'-CGGAATTCCNGGYARRTTGTC
DG146    5            5'-CGGAATTCCNGGRAGRTTATC
DG147    6            5'-CGGAATTCCNGGRAGRTTGTC
DG148    7            5'-CGGAATTCGCNGTYTTYTCWCC
DG149    8            5'-CGGAATTCGCNGTYTTYTCSCC
DG152    9            5'-CGAGATCTGARGCNGAYGATGT
DG153    10           5'-CGAGATCTGARGCNGAYGACGT
DG154    11           5'-CGAGATCTACNGCNACWGG
DG155    12           5'-CGAGATCTACNGCNACSGG
DG156    13           5'-CGAGATCTCARAAYATHCCWGT
DG157    14           5'-CGAGATCTCARAAYATHCCSGT
DG160    15           5'-CGGAATTCRTCRTGWACCTG
DG161    16           5'-CGGAATTCRTCRTGWACTTG
DG162    17           5'-CGGAATTCRTCRTGSACCTG
DG163    18           5'-CGGAATTCRTCRTGSACTTG
DG164    19           5'-CGAGATCTGGNTAYGTWGAAAC
DG165    20           5'-CGAGATCTGGNTAYGTWGAGAC
DG166    21           5'-CGAGATCTGGNTAYGTSGAAAC
DG167    22           5'-CGAGATCTGGNTAYGTSGAGAC
DG168    23           5'-CGGAATTCGTYTCNACRTAWCC
DG169    24           5'-CGGAATTCGTYTCNACRTASCC
DG173    25           5'-CGGAATTCATYCKYTCSGC
DG174    26           5'-CGGAATTCATRCGYTCSGC
DG175    27           5'-CGGAATTCATYCKYTCWGC
DG176    28           5'-CGGAATTCATRCGYTCWGC
DG181    29           5'-CGGAATTCNGCNGCNGTSCCYTG
DG182    30           5'-CGGAATTCNGCNGCNGTWCCYTG
DG126    60           5'-CGGAATTCGCCCACATWGGYTC
DG127    61           5'-CGGAATTCGCCCACATSGGYTC
DG128    62           5'-CGAGATCTGGNGAYGAYCCWATG
DG129    63           5'-CGAGATCTGGNGAYGAYCCSATG
DG130    64           5'-CGGAATTCATNGGRTCRTCWCC
DG131    65           5'-CGGAATTCATNGGRTCRTCSCC
DG137    66           5'-CGAGATCTGARGGSGARGA

Table 1 - Continued
Primer Sequences

Primer   SEQ ID NO:   Sequence
DG140    67           5'-CGAGATCTGCNCAYATGGAAGC
DG141    68           5'-CGAGATCTGCNCAYATGGAGGC
DG150    69           5'-CGAGATCTGTNTTYGAYGCWAA
DG151    70           5'-CGAGATCTGTNTTYGAYGCSAA
DG158    71           5'-GGGAATTCACNGGDATRTTTTG
DG159    72           5'-GGGAATTCACNGGDATRTTCTG
DG183    73           5'-<^TTCCTAATTCCAAATTCGAAATTGACT-
                      GGCGCGCGGCCCGGGCGGCCGC
MK131    74           5'-CCCGGATCAGGTTCTCGTC
MK143    75           5'-CCGCTGTCCTGGCCCACATG



A. Protein Sequence Homology 

To underscore the power of the degenerate PCR priming method of the
invention, information regarding the amino acid and DNA sequence homology between
the thermostable DNA polymerases is provided below. Similarity and identity are
determined using University of Wisconsin sequence analysis programs (Devereux et
al., 1984, Nuc. Acids Res. 12(1):387-395).

Amino Acid Homology

                    Taq                      E. coli
            Similarity   Identity     Similarity   Identity
Taq           100          100          60.8         41
E. coli       60.8         41           100          100
sps17         91.4         84.1         62.5         41.9
Z05           93.5         86.7         59.6         40.4
Taf           62.3         41.5         62.3         41.5

DNA Sequence Identity

            Taq
sps17       83
Z05         85
Taf         44.6

B. Calculation of Tm

Tm is defined as the temperature at which half of the template is dissociated
from the primer. The equation used for the calculation of Tm is derived from the
thermodynamic equation:

$$-RT \ln(K_d) = \Delta H^\circ - T\,\Delta S^\circ$$

where $R$ is the gas constant, $T$ is the temperature (in kelvin), $K_d$ is the dissociation
constant, and $\Delta H^\circ$ and $\Delta S^\circ$ are the thermodynamic values taken from
Breslauer et al., 1986, Proc. Natl. Acad. Sci. USA 83:3746-3748. Rearranging the equation:

$$T = \frac{\Delta H^\circ}{\Delta S^\circ - 2.3\,R \log_{10}(K_d)}$$

In the presence of primer excess, the Tm is defined as:

$$T_m = \frac{\Delta H^\circ}{\Delta S^\circ - 2.3\,R \log_{10}[P]}$$

where $[P]$ is the concentration of primer.

The values of $\Delta H^\circ$ and $\Delta S^\circ$ taken from Breslauer et al. define the Tm in the
presence of 1 M NaCl. To correct for the conditions of the PCR buffer (50 mM salt)
the following correction is made (taken from Dove et al., J.M.B. 5:359 (1966)):

$$T_m(\mu_2) - T_m(\mu_1) = 18.5 \log_{10}(\mu_2/\mu_1)$$

where $\mu_1$ and $\mu_2$ are the ionic strengths of the buffer as defined by the equation:

$$\mu = \tfrac{1}{2} \sum_i m_i Z_i^2$$
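
The three relations above can be combined in a short numerical sketch. The enthalpy and entropy values used here are illustrative placeholders, not values from Breslauer et al., and are entered as positive dissociation values in cal/mol and cal/(mol·K); that sign and unit convention, the value of R in those units, and the salt coefficient read as 18.5 are assumptions of this sketch.

```python
import math

R = 1.987  # gas constant in cal/(mol*K), assuming dH in cal/mol


def tm_kelvin(delta_h: float, delta_s: float, primer_molar: float) -> float:
    """Tm = dH / (dS - 2.3 * R * log10[P]) for primer in excess."""
    return delta_h / (delta_s - 2.3 * R * math.log10(primer_molar))


def ionic_strength(ions) -> float:
    """mu = 1/2 * sum(m * Z**2) over (molarity, charge) pairs."""
    return 0.5 * sum(m * z ** 2 for m, z in ions)


def salt_shift(mu_1: float, mu_2: float) -> float:
    """Tm(mu_2) - Tm(mu_1) = 18.5 * log10(mu_2 / mu_1)."""
    return 18.5 * math.log10(mu_2 / mu_1)


# Illustrative (not measured) thermodynamic values for a primer/template duplex.
delta_h = 150_000.0   # cal/mol
delta_s = 400.0       # cal/(mol*K)

# Tm at the 1 M NaCl reference condition, primer pool at 250 nM.
tm_reference_c = tm_kelvin(delta_h, delta_s, 250e-9) - 273.15

# Ionic strengths of the reference (1 M NaCl) and of a 50 mM monovalent salt buffer.
mu_reference = ionic_strength([(1.0, +1), (1.0, -1)])
mu_pcr = ionic_strength([(0.05, +1), (0.05, -1)])

tm_pcr_c = tm_reference_c + salt_shift(mu_reference, mu_pcr)
print(f"Tm at 1 M NaCl: {tm_reference_c:.1f} C; Tm in PCR buffer: {tm_pcr_c:.1f} C")
```

Moving from 1 M to 50 mM monovalent salt lowers the estimate by about 24°C with this coefficient, which is why the correction matters for the pool Tm values listed below.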

With the equations above, one can calculate the Tm for the primer pools used in
the degenerate priming with respect to either Taq or Taf DNA polymerase I gene
sequences. The Tm for various pools are shown below; "all" refers to the total primer
pool at a concentration of 250 nM, whereas "exact" takes into account the exact
concentration of the most completely complementary primer in the pool. The
concentration of the most complementary primer is the total concentration divided by
the degeneracy of the pool. Lower case letters indicate a base pair mismatch relative to
the Taq sequence. The primers were designed to be complementary to the underlined
regions; 5' sequences incorporate restriction sites to facilitate cloning of the amplified
fragment.
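
The difference between the "all" and "exact" figures follows directly from the primer-concentration term in the Tm equation. The sketch below shows the direction of the effect for a 32-fold degenerate pool at 250 nM total; the thermodynamic values are illustrative placeholders rather than values for any primer in Table 1, and the units (cal/mol, cal/(mol·K)) are an assumption.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)


def exact_concentration(total_molar: float, pool_degeneracy: int) -> float:
    """Concentration of the single best-matched primer in a degenerate pool."""
    return total_molar / pool_degeneracy


def tm_celsius(delta_h: float, delta_s: float, primer_molar: float) -> float:
    """Tm = dH / (dS - 2.3 * R * log10[P]), converted to degrees Celsius."""
    return delta_h / (delta_s - 2.3 * R * math.log10(primer_molar)) - 273.15


total = 250e-9                           # total pool concentration, 250 nM
exact = exact_concentration(total, 32)   # one primer out of a 32-fold pool

# Illustrative (not measured) dissociation enthalpy and entropy.
delta_h, delta_s = 150_000.0, 400.0

tm_all = tm_celsius(delta_h, delta_s, total)
tm_exact = tm_celsius(delta_h, delta_s, exact)
print(f"exact primer: {exact * 1e9:.2f} nM; Tm all {tm_all:.1f} C, exact {tm_exact:.1f} C")
```

Because the effective primer concentration drops with the degeneracy of the pool, the "exact" Tm is always the lower of the two, as in the values tabulated below.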

FORWARD

TAQ         CGGGCTACGAGGCGGACGACGT
DG152       CGaGaTctGARGCNGAyGAtGT
DG153       CGaGaTctGARGCNGAyGACGT
CONSENSUS   CGaGaTctGARGCNGAyGAyGT
TAF         AAGGCTTTGAAGCTGATGA^AT

Calculated Tm
            All       Exact
Taq         73°C      65°C
Taf         56°C      47°C


REVERSE

TAQ         CTTGACCCCGGGAAGGTTGTC
DG144       CggaAttCCNGGyARRTTaTC
DG145       CggaAttCCNGGyARRTTGTC
DG146       CggaAttCCNGGrAGRTTaTC
DG147       CggaAttCCNGGrAGRTTGTC
CONSENSUS   CggaAttCCNGGnARRTTrTC
TAF         TTTAACTCCTGGGATATTATC

Calculated Tm
            All       Exact
Taq         68°C      57°C
Taf         53°C      42°C





FORWARD

TAQ         GGAGGCGGGGGTACGTGGAGAC
DG164       cGAGatctGGNTAyGTwGAaAC
DG165       cGAGatctGGNTAyGTwGAGAC
DG166       cGAGatctGGNTAyGTSGAaAC
DG167       cGAGatctGGNTAyGTSGAGAC
CONSENSUS   cGAGatctGGNTAyGTNGArAC
TAF         GGAAAAAAGGTTATGTTACAAC

Calculated Tm
            All       Exact
Taq         62°C      52°C
Taf         47°C      38°C

REVERSE

TAQ         ACeAGCTCGTCflTftrcar^ Tff
DG160       cggAatTCRTCRTGwACCTG
DG161       cggAatTCRTCRTGwACtTG
DG162       cggAatTCRTCRTGSACCTG
DG163       cggAatTCRTCRTGSACtTG
CONSENSUS   cggAatTCRTCRTGNACYTG
TAF         ACTAACTCGTf!ATfiA,ftf ^"Tff

Calculated Tm
            All       Exact
Taq         71°C      62°C
Taf         62°C      53°C



FORWARD

TAQ         AGACGGCCACGGCCACGGG
DG154       cGAgatCtACNGCNACwGG
DG155       cGAgatCtACNGCNACSGG
CONSENSUS   cGAgatCtACNGCNACNGG
TAF         AAACAGG AACTTCTAHf ^

Calculated Tm
            All       Exact
Taq         62°C      52°C
Taf         40°C      29°C

REVERSE

TAQ         ATGAGGTCGGCGGCGGTGCCCTG
DG181       cgGAatTCNGCNGCNGTSCCyTG
DG182       cgGAatTCNGCNGCNGTwCCyTG
CONSENSUS   cgGAatTCNGCNGCNGTNCCyTG
TAF         an-TATOqyAfir.TOCTGTTCCTTG v

Calculated Tm
            All       Exact
Taq         89°C      78°C
Taf         71°C      58°C

C. General Methodology

Standard 2- and 3-temperature profiles were used for screening degenerate
primers on TZ05 and Tsps17 (see U.S. Patent application Nos. 590,213, filed
September 28, 1990, and 590,466, filed September 28, 1990, both incorporated herein
by reference). However, it was noted early in the work with Taf that the standard
profiles were inadequate. Figure 1 shows a variety of temperature profiles. Figure 2
shows the effect of temperature profile on the amplification of a purified DNA
fragment. Amplification of Taf chromosomal DNA with the degenerate primer pools
DG154-DG155 and DG160-DG164 generated the pattern shown in lane 4 (Figure 2).
The high molecular weight band is the desired fragment (later confirmed by cloning and
DNA sequence analysis). The lower molecular weight bands and the general ethidium
bromide staining background represent nonspecific amplification which potentially
might mask specific amplification products as well as interfere significantly with
cloning of the desired band.

The desired band was purified from an agarose gel and reamplified using
temperature profiles 2-5 (Figure 1 and lanes 5-8 of Figure 2). Standard 2-temperature
profiles (lanes 5 and 6) were inadequate in generating amplification of the purified
band. It appeared that amplification of the small amount of contaminating lower
molecular weight bands predominated. In contrast, temperature profile 5 (Figure 3), in
which the standard plateau at the lower temperature was replaced by a 5 minute ramp
extending between the lower temperature and 75°C, produced the desired band as the
predominant product.

Complex temperature profiles were applied to the screening of degenerate
primer pools with the Taf chromosomal DNA as shown in Figure 1. Generally, 5
series of amplifications were performed with many primer pairs. For profiles 1 and 2,
an initial 5 cycles of amplification were performed in which a low temperature point
was programmed (40°C and 45°C, respectively) followed by 25 cycles in which the
low temperature was programmed at 50°C. In profile 3, 30 cycles were performed
with the low temperature programmed at 50°C. Profiles 4, 5 and 6 increased the low
temperature point by 5°C each and increased the cycle number by 5 or 10 cycles.
Measurement of in-tube temperature showed that the temperature in the tube reached 1
to 2°C above the low temperature setting.
D. Results 

Amplification products were obtained from PCR amplification of Taf DNA
using the primers listed below. Each amplification yielded products of a molecular
weight equal to or greater than that obtained from amplification of Taq DNA using the
same degenerate primers. Mismatches between the Taf sequence and the degenerate
primers are shown counting from the 3' end of the primer.

DG152-DG153 with DG144-DG147
    -2 (DG152-DG153) and -7 (DG144-DG147)
DG152-DG153 with DG148-DG149
    -2 (DG152-DG153) and -4 (DG148-DG149)
DG154-DG155 with DG160-DG163
    -8 (DG154-DG155)
DG154-DG155 with DG173-DG176
    -8 (DG154-DG155) and -2,-12 (DG173-DG176)
DG154-DG155 with DG181-DG182
    -8 (DG154-DG155)
DG156-DG157 with DG168-DG169
    -1,-2,-8 (DG156-DG157)
DG164-DG167 with DG160-DG163
    -4,-5 (DG164-DG167)

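
The mismatch positions in this list can be reproduced with a small comparison routine. The sketch below is illustrative: it compares the 14 template-specific 3' nucleotides of the DG144-DG147 primers with the corresponding Taf sequence read from the reverse-primer alignment in section B, and it assumes that the final Taf base is C, since no mismatch at position -1 is reported. Function names are hypothetical.

```python
# IUPAC codes needed for the primers compared here.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "N": "ACGT",
}


def mismatches_from_3prime(primer: str, target: str) -> list:
    """Return mismatch positions as negative offsets from the primer 3' end.

    `primer` may contain ambiguity codes; `target` is the unambiguous
    sequence the primer is meant to equal (same strand sense, same length).
    """
    if len(primer) != len(target):
        raise ValueError("primer and target must have the same length")
    positions = []
    pairs = zip(reversed(primer.upper()), reversed(target.upper()))
    for offset, (p, t) in enumerate(pairs, start=1):
        if t not in IUPAC[p]:
            positions.append(-offset)
    return positions


def best_in_pool(pool, target: str) -> list:
    """Mismatch list of the pool member with the fewest mismatches."""
    return min((mismatches_from_3prime(p, target) for p in pool), key=len)


# 3' portions (14 nt) of DG144, DG145, DG146 and DG147.
pool = ["CCNGGYARRTTATC", "CCNGGYARRTTGTC", "CCNGGRAGRTTATC", "CCNGGRAGRTTGTC"]
taf_target = "CCTGGGATATTATC"   # last base assumed to be C

for primer in pool:
    print(primer, mismatches_from_3prime(primer, taf_target))
print("best in pool:", best_in_pool(pool, taf_target))
```

The best-matched member of the pool (the DG146 sequence) differs from the Taf target only at position -7, which agrees with the entry for DG144-DG147 in the list above.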
Magnesium concentration is known to affect amplification efficiency. The
optimum magnesium concentration depended on both the template and the primer sets
used. With DG144-DG147 and DG152-DG153 the optimum magnesium concentration
was 3 mM with Taf chromosomal DNA. With the set DG154-DG155 and DG160-
DG163 with Taf DNA, the optimum was 2 mM. For the final buffer, 2 mM was
standardly used.

Example 3 

Isolation of DNA Fragments Encoding Thermosipho africanus DNA Polymerase I

This example presents a degenerate primer method used to isolate DNA
fragments that encode Taf DNA polymerase I. In this method, various sets of forward
and reverse primers were used in the polymerase chain reaction. These primers were
designed to various conserved motifs comprising the 5'→3' nuclease domain, the
template/primer binding domain, the dNTP binding domain, or the single-stranded
template DNA binding domain of polymerases of known amino acid sequence (E. coli,
T7, and Taq). Primer sequences are provided in the Sequence Listing section; Table 1
provides the identification number for each primer.

Pairs of degenerate primers were screened using the sets of six profiles with the
modified 5-minute ramp profiles as described in Example 2. The amount of magnesium
in the amplification was found to affect the amount of PCR product amplified. The
magnesium optimum depended both on the primer pairs chosen as well as the template.
For screening of the degenerate primer pools on Taf chromosomal DNA, an average
magnesium concentration (2 mM) was chosen.

The PCR conditions thus consisted of 10 mM Tris, pH 8.3, 50 mM KCl, 2 mM
MgCl2, 200 µM each dNTP, 10 ng chromosomal DNA (at 4.7 x 10^6 base pairs per
genome, or 5.2 x 10^-15 g/genome, was equivalent to 3.2 x 10^-14 M genome), 500 nM
each oligo primer set, and 2.5-5 units Taq polymerase. A total of 16 pairs of primer
pools were used, concentrating on sets to amplify the 5' and 3' end of the coding
sequence of the polymerase I gene. Of the 16 sets screened, 7 sets (DG152-DG153
with DG144-DG147, DG152-DG153 with DG148-DG149, DG154-DG155 with
DG173-DG176, DG154-DG155 with DG181-DG182, DG154-DG155 with DG160-
DG163, DG156-DG157 with DG168-DG169, DG164-DG167 with DG160-DG163)
produced discrete bands of a molecular weight equal to or greater than that of the Taq
product.
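
The genome mass and molarity quoted in the PCR conditions can be checked with a few lines of arithmetic. Two inputs below are assumptions not stated in the text: an average mass of 660 g/mol per base pair of double-stranded DNA, and a reaction volume of 100 µl, chosen because it reproduces the quoted molarity.

```python
AVOGADRO = 6.022e23      # molecules per mole
BP_MASS = 660.0          # g/mol per base pair (assumed average for dsDNA)


def genome_mass_g(genome_bp: float) -> float:
    """Mass in grams of a single genome copy."""
    return genome_bp * BP_MASS / AVOGADRO


def genome_molarity(dna_g: float, genome_bp: float, volume_l: float) -> float:
    """Molar concentration of genome copies for a given DNA mass and volume."""
    copies = dna_g / genome_mass_g(genome_bp)
    return copies / AVOGADRO / volume_l


genome_bp = 4.7e6        # base pairs per genome, from the text
dna_g = 10e-9            # 10 ng chromosomal DNA, from the text
volume_l = 100e-6        # assumed 100 microliter reaction

mass = genome_mass_g(genome_bp)
molarity = genome_molarity(dna_g, genome_bp, volume_l)
print(f"{mass:.2e} g per genome, {molarity:.2e} M genome")
```

Under these assumptions the result is about 5.2 x 10^-15 g per genome and about 3.2 x 10^-14 M, matching the figures given above.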

Four of the PCR products were selected and cloned (DG152-DG153 with
DG148-DG149, DG154-DG155 with DG181-DG182, DG154-DG155 with DG160-
DG163, DG164-DG167 with DG160-DG163). For cloning, the amount of the desired
product was enriched. The PCR reaction products were extracted with chloroform to
remove the oil, extracted with phenol/chloroform to remove the Taq polymerase, ether
extracted to remove residual phenol, and concentrated and desalted over a Biogel P-4
spin column (marketed by Bethesda Research Laboratories). The preparations were
electrophoresed on a 3% low melting NuSieve™ GTG agarose gel, and the desired
band cut out. The DNA fragment was isolated from the agarose by repeated phenol
extractions, ether extractions, and desalting over a Biogel P-4 spin column.

The product was then reamplified with the same primer sets used in the initial
generation using protocol 3 (Figure 3), in which the setting for the initial low-
temperature was 50°C. The oil was removed from the reactions by chloroform
extraction, the polymerase by phenol extraction, and residual phenol by ether
extraction. Following desalting over a Biogel P-4 spin column, the preparations were
restricted with EcoRI and BglII according to the manufacturer's specifications. These
sites were included in the incorporated primer sequences to allow for subsequent
cloning, as shown in Table 2. The restriction enzymes were removed by phenol
extraction, the samples concentrated and electrophoresed on a 3% low melting
NuSieve™ GTG agarose gel. The target band was isolated as described above.

Vector pBSM13+HindIII::BglII was prepared by restricting plasmid with BglII
and EcoRI and dephosphorylating with bacterial alkaline phosphatase. Protein was
removed by extraction with phenol/chloroform, and the preparation desalted over a
Biogel P-4 spin column. Vector pBSM13+ (purchased from Stratagene) was used to
make vector pBSM13+HindIII::BglII by digesting vector pBSM13+ with restriction
enzyme HindIII, blunting the ends of the digested vector by Klenow treatment, ligating
BglII linkers (5'-CAGATCTG), transforming host cells, and selecting transformants
which contained a plasmid identical to pBSM13+ but for the absence of a HindIII site
and the presence of a BglII site.

A sample of the purified fragment and prepared vector were ligated at 10°C for
15 hours, transformed into DG98, and a sample of the transformed bacteria plated on
ampicillin-containing agar plates. Ampicillin-resistant colonies were isolated, and crude
plasmid prepared. The correct clones were identified by comparing the size of the
insert following restriction of the crude plasmid with EcoRI and BglII with that of the
initial PCR product, and comparing the digestion pattern of the cloned insert with
that of the initial PCR product using a variety of restriction endonucleases. Single-
stranded DNA was prepared from selected clones and the sequence determined by
standard dideoxy sequencing methods. The amino acid sequence deduced from the
DNA sequence contained significant homology to known polymerase sequences,
suggesting that the PCR products were in fact derived from a Taf polymerase gene.

This strategy resulted in the successful amplification and cloning of various
regions of the Taf DNA polymerase gene with the primer pairs shown in Table 2. The
primers were designed to be complementary to sequences coding for the amino acid
sequences shown; upstream sequences incorporate restriction sites used in the cloning
of the amplified product. In Table 2, the amino acid sequence shown below the DNA
sequence for the reverse primer is given in the carboxy to amino direction and encoded
by the complement of the sequence. The primers shown in Table 2 are characterized as
follows.

Synthetic oligodeoxyribonucleotides DG148 and DG149 are two different
32-fold degenerate (each) 22 mer pools designed as "reverse" primers to one of the
motifs in the 5' to 3' exonuclease domain (3' most 14 nucleotides) of thermostable
DNA polymerases. The primers are designed to complement the (+)-strand DNA
sequence that encodes the motif Gly-Glu-Lys-Thr-Ala and which corresponds
identically to Taq DNA polymerase amino acids 200 through 204 and to Tth DNA
polymerase amino acids 201 through 205. This motif is found in a DNA polymerase
gene in all Thermus species. The combined primer pool is 64-fold degenerate and the
primers encode an EcoRI recognition sequence at their 5'-ends.
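
How a "reverse" primer of this kind follows from the amino acid motif can be sketched as below: the motif is back-translated into a degenerate (+)-strand sequence, truncated to 14 nucleotides, and reverse-complemented. The codon table is deliberately partial (only the five residues of this motif) and the function names are illustrative, not part of the specification.

```python
# Degenerate codons for the residues of the motif only (standard genetic code).
DEGENERATE_CODONS = {"G": "GGN", "E": "GAR", "K": "AAR", "T": "ACN", "A": "GCN"}

# Complements of the IUPAC codes used here (S and W are self-complementary).
IUPAC_COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R", "S": "S", "W": "W", "N": "N",
}


def back_translate(peptide: str) -> str:
    """Degenerate (+)-strand DNA encoding a peptide given in one-letter code."""
    return "".join(DEGENERATE_CODONS[aa] for aa in peptide)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence that may contain ambiguity codes."""
    return "".join(IUPAC_COMPLEMENT[base] for base in reversed(seq))


# Gly-Glu-Lys-Thr-Ala, keeping the first 14 nucleotides of the coding strand.
coding = back_translate("GEKTA")[:14]
reverse_core = reverse_complement(coding)
print(coding, "->", reverse_core)
```

The result, GCNGTYTTYTCNCC, corresponds to the 3' portions of DG148 and DG149 in Table 1 once the N nearest the 3' end is split into W (DG148) and S (DG149); each pool is then 4 x 2 x 2 x 2 = 32-fold degenerate, as stated above.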

Synthetic oligodeoxyribonucleotides DG152 and DG153 are two different
16-fold degenerate (each) 23 mer pools designed as "forward" primers to one of the
motifs in the 5' to 3' exonuclease domain (3' most 14 nucleotides) of thermostable
DNA polymerases. This motif is the amino acid sequence Glu-Ala-Asp-Asp-Val and
corresponds identically to Taq DNA polymerase amino acids 117 through 121 and to
Tth DNA polymerase amino acids 118 through 122. This motif is found in a DNA
polymerase gene in all Thermus species. The combined primer pool is 32-fold
degenerate and the primers encode a BglII recognition sequence at their 5'-ends.

Synthetic oligodeoxyribonucleotides DG154 and DG155 are two different
32-fold degenerate (each) 19 mer pools designed as "forward" primers to one of the
motifs in the primer:template binding domain (3' most 11 nucleotides) of thermostable
DNA polymerases. This motif is the tetrapeptide amino acid sequence Thr-Ala-Thr-Gly
and corresponds identically to Taq DNA polymerase amino acids 569 through 572, and
to Tth and Thermus species Z05 DNA polymerase amino acids 571 through 574. This
motif is found in a DNA polymerase gene in all Thermus species. The combined
primer pool is 64-fold degenerate and the primers encode a BglII recognition sequence
at their 5'-ends.

Synthetic oligodeoxyribonucleotides DG160 through DG163 are four different
8-fold degenerate (each) 20 mer pools designed as "reverse" primers to one of the
motifs in the template binding domains (3' most 14 nucleotides) of thermostable DNA
polymerases. The primers are designed to complement the (+)-strand DNA sequence
that encodes the motif Gln-Val-His-Asp-Glu and which corresponds identically to Taq
DNA polymerase amino acids 782 through 786, and to Tth and Thermus species Z05
DNA polymerase amino acids 784 through 788. This motif is found in a DNA
polymerase gene in all Thermus species. The combined primer pool is 32-fold
degenerate and the primers encode an EcoRI recognition sequence at their 5'-ends.

Synthetic oligodeoxyribonucleotides DG164 through DG167 are four different
16-fold degenerate (each) 22 mer pools designed as "forward" primers to one of the
motifs in the template binding domain (3' most 14 nucleotides) of thermostable DNA
polymerases. This motif is the pentapeptide amino acid sequence Gly-Tyr-Val-Glu-Thr
and corresponds identically to Taq DNA polymerase amino acids 718 through 722, and to
Tth and Thermus species Z05 DNA polymerase amino acids 720 through 724. This
motif is found in a DNA polymerase gene in most Thermus species. The combined
primer pool is 64-fold degenerate and the primers encode a BglII recognition sequence
at their 5'-ends.

Synthetic oligodeoxyribonucleotides DG181 and DG182 are two different
256-fold degenerate (each) 23 mer pools designed as "reverse" primers to one of the
motifs in the template binding domain (3' most 17 nucleotides) of thermostable DNA
polymerases. The primers are designed to complement the (+)-strand DNA sequence
that encodes the motif Gln-Gly-Thr-Ala-Ala-Asp and which corresponds identically to
Taq DNA polymerase amino acids 754 through 759 and to Tth DNA polymerase amino
acids 756 through 761. This motif is found in a DNA polymerase gene in all Thermus
species. The combined primer pool is 512-fold degenerate and the primers encode an
EcoRI recognition sequence at their 5'-ends.

The entire coding sequence for Taf polymerase was then identified. Taf
chromosomal DNA was digested with BamHI, BglII, ClaI, EcoRI, HindIII, KpnI,
PstI, SacI, and SalI according to the manufacturer's specifications and electrophoresed
(with radioactively labeled HindIII-digested lambda DNA as a molecular weight
marker) on a 0.7% agarose gel. The gel was acid nicked in 0.25 N HCl (30 minutes),
and transferred to HybondN+™ nylon membrane (marketed by Amersham) by capillary
action in 0.4 N NaOH for 19 hours. The DNA was cross-linked to the membrane by
irradiating with 50 mjoules by a Stratalinker™ 1800 (marketed by Stratagene) and
treated with prehybridization buffer.

Radioactive probes were generated from the regions encoded between the
primer pairs DG160-DG163 and DG164-DG167, and DG144-DG147 to DG152-
DG153. Initial PCR product was generated, confirmed by restriction analysis, and
purified as described above. Amplification was then repeated using a sample of the
purified PCR product as the template, and replacing the dGTP in amplification with
50 µM α-32P-dGTP. The oil was removed by chloroform extraction, the polymerase
by extraction with phenol/chloroform, the sample concentrated, and unincorporated
label removed by desalting over a Biogel P-4 spin column. The preparation was
electrophoresed on a 3% low melting NuSieve™ GTG agarose gel, and the target
radioactively labeled band isolated as described above.

The 3' end of the coding sequence for the polymerase gene was identified by
hybridizing the chromosomal blots with 3.6 x 10⁶ cpm of probe Taf DG160-DG163 to
DG164-DG167 at 50°C for 17 hours. The blots were washed twice with 2 X SSPE,
0.1% SDS at 23°C for 10 to 25 minutes, then 1 X SSPE, 0.1% SDS at 52°C for 20
minutes and autoradiographed. Discrete bands hybridizing to the probe were identified
and their molecular weights determined by comparison to the radioactively labeled
lambda markers. Thus, the 3' end of the gene was located within the following fragments.

Enzyme     Molecular weight (bp) of fragment containing 3' end
BamHI      6,000 or 21,000
BglII      2,330
ClaI       8,100
EcoRI      4,900 or 6,800
HindIII    5,350, 3,000 or 1,680
KpnI       19,500
PstI       1,410 or 18,000
SacI       >23,000
SalI       >23,000



The portion of the gene encoding the 5' end of the polymerase sequence was then
identified. The 3' probe was removed by boiling the blots in 0.5% SDS, and the
membranes hybridized to 3.0 x 10⁶ cpm of Taf probe DG152-DG153 to DG148-DG149
at 66°C for 22 hours. The membranes were washed twice in 2 X SSPE, 0.1%
SDS at 23°C for 10 minutes and 1 X SSPE, 0.1% SDS at 65°C for 30 minutes, and
autoradiographed. The following fragments were therefore identified as containing
sequences that code for the 5' end of the polymerase gene:



Enzyme     Molecular weight (bp) of fragment containing 5' end
BamHI      20,000
BglII      2,280
ClaI       7,500
EcoRI      6,800
HindIII    2,350
KpnI       19,500
PstI       16,000
SacI       21,000
SalI       >23,000



From the two hybridization patterns, it was determined that the gene contains both
BglII and HindIII sites. In addition, a 6,800 bp EcoRI fragment contains sequences
coding for both the 3' and 5' ends of the polymerase sequence.
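The comparison of the two hybridization patterns can be sketched as a lookup for fragments that appear in both tables. This is a simplified illustration: it assumes that a fragment carrying both ends of the gene shows up with exactly the same reported size under both probes, and it ignores gel sizing error and the unresolved ">23,000" bands. The dictionary and function names are hypothetical.

```python
# Fragment sizes (bp) as reported for the 3'-end and 5'-end probes.
# Bands reported only as ">23,000" are unresolved and are left out.
THREE_PRIME = {
    "BamHI": [6000, 21000],
    "BglII": [2330],
    "ClaI": [8100],
    "EcoRI": [4900, 6800],
    "HindIII": [5350, 3000, 1680],
    "KpnI": [19500],
    "PstI": [1410, 18000],
}
FIVE_PRIME = {
    "BamHI": [20000],
    "BglII": [2280],
    "ClaI": [7500],
    "EcoRI": [6800],
    "HindIII": [2350],
    "KpnI": [19500],
    "PstI": [16000],
    "SacI": [21000],
}


def shared_fragments(three_prime, five_prime):
    """Return, per enzyme, the fragment sizes detected by both probes."""
    shared = {}
    for enzyme in sorted(set(three_prime) & set(five_prime)):
        common = sorted(set(three_prime[enzyme]) & set(five_prime[enzyme]))
        if common:
            shared[enzyme] = common
    return shared


candidates = shared_fragments(THREE_PRIME, FIVE_PRIME)
print(candidates)
```

With this exact-match criterion the 6,800 bp EcoRI fragment is flagged, along with the much larger 19,500 bp KpnI fragment; the smaller EcoRI fragment is the more convenient cloning target.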

The 6,800 bp EcoRI fragment containing the entire gene was then cloned from
the chromosome. Taf chromosomal DNA (20 µg) was digested with EcoRI according
to the manufacturer's specifications. The completion of digestion was confirmed by
electrophoresis of a sample on a 0.7% agarose gel, acid nicking, transferring to
HybondN+™ in 0.4 N NaOH, and probing with radioactively labeled Taf PCR product
extending between DG160-DG163 and DG164-DG167. The complete digest was size
fractionated by electroelution on a 0.5% SeaKem™ agarose LB gel in TEA and fractions
collected. The fractions containing the target EcoRI fragment were identified by
electrophoresis on a 0.7% agarose gel, which was then acid nicked, transferred to
HybondN+™ in 0.4 M NaOH, and hybridized to radioactively labeled Taf PCR product
extending between DG160-DG163 and DG164-DG167. The fractions containing the
6,800 bp EcoRI fragment were pooled, concentrated, and desalted over a Biogel P4
spin column.

Three vectors were prepared by digesting pBR322, pUC13, and
pBSM13+HindIII::BglII with EcoRI, dephosphorylating with bacterial alkaline
phosphatase, extracting with phenol/chloroform and then ether, and desalting over a
Biogel P-4 spin column. The size fractionated material containing the 6,800 bp EcoRI
fragment was ligated into the vectors, transformed into DG98, and the transformation
mixture plated onto ampicillin-containing agar plates.

Following growth at 37°C for 16 hours, the colonies were lifted onto
nitrocellulose filters, lysed with triton lytic buffer, the DNA denatured using 0.5 M
NaOH, 1 M NaCl, neutralized with 0.5 M Tris, pH 8.0, 1.0 M NaCl, rinsed with 0.3
M NaCl, 10 mM Tris, pH 7.6, 1 mM EDTA, pH 8.0, and baked at 80°C for 3 hours.
The filters were incubated with prehybridization buffer at 65°C for 1 hour and
hybridized with 4.4 x 10⁵ cpm of radioactively labeled Taf PCR product extending
between DG160-DG163 and DG164-DG167 for 15 hours at 50°C. The filters were
washed in 5 X SSC, 0.1% SDS at 23°C for 16 minutes, 2 X SSC, 0.1% SDS at 23°C
for 30 minutes, and autoradiographed.

Probe positive colonies were inoculated into broth containing ampicillin and
methicillin and grown at 37°C. The correct clones were identified by isolating plasmid
DNA followed by restriction enzyme analysis. Clones containing a 6.8 kb insert were
identified by restriction with EcoRI. The correct clones were further identified by
restriction analysis with HindIII, HindIII and EcoRI, or BglII, because it was
determined in the chromosomal mapping that the polymerase gene contained both
HindIII and BglII sites.

For further confirmation, the restriction digests of the suggested clones were
electrophoresed on a 0.7% agarose gel, and the DNA bands transferred to HybondN+™
and probed with radioactively labeled Taf PCR product extending between DG160-DG163
and DG164-DG167, and subsequently with radioactively labeled Taf PCR
product extending between DG144-DG147 and DG152-DG153 as previously
described. Several clones were confirmed as correct (52-1, 52-2, 52-3, 52-6, 52-7,
and 52-9).

To facilitate sequencing and subsequent manipulation of the polymerase gene
for the construction of expression vectors, a smaller 3,000 bp EcoRV fragment was
subcloned from the larger 6,800 bp EcoRI fragment. Clone 52-1, containing the 6,800
bp EcoRI fragment, was digested with EcoRV according to the manufacturer's
specifications, concentrated, desalted over a Biogel P-4 spin column, and
electrophoresed on a 1% low melting NuSieve™ GTG agarose gel. The target band
was then purified as previously described.

For the vector, pBSM13+HindIII::BglII was restricted with SmaI and
dephosphorylated by bacterial alkaline phosphatase. The protein was removed by
phenol/chloroform extraction, and the preparation desalted over a Biogel P-4 spin
column. A sample of the vector and purified fragment was ligated with T4 DNA ligase
and T4 RNA ligase at 23°C for 7 hours, transformed into DG98 and the transformation
mixture plated on ampicillin-containing agar plates. Ampicillin-resistant colonies were
selected and grown in liquid broth, and crude plasmid preparations isolated.

The size of the insert was determined by restriction with EcoRI and BamHI,
which cut the vector on both sides of the SmaI site that contained the insert. The
identity of the insert was further confirmed by restriction with BglII, a site previously
determined to be within the gene from the mapping of the chromosome (described
above), EcoRI with ClaI, and EcoRI with SpeI. Clones were identified which
contained the polymerase gene in both orientations. The orientation that placed the
coding sequence in position for expression from the lac promoter was designated
pBSM:TafEcoRV.

Example 4

Construction of Thermosipho africanus DNA Polymerase I Expression Vectors
The entire Taf DNA polymerase I coding sequence can be isolated from Taf
genomic DNA on an approximately 3 kb EcoRV fragment. This EcoRV fragment was
isolated and cloned into the Stratagene™ vector pBSM13+, which had first been
digested with restriction enzyme SmaI. The resulting vector was designated
pBSM:TafEcoRV, and the orientation of the Taf gene EcoRV DNA fragment is such
that the lac promoter, ribosome-binding site (RBS), and ATG start codon for the
coding sequence of beta-galactosidase from the pBSM13+ vector are positioned for
expression of the Taf DNA polymerase I coding sequence. The ATG start codon of the
Taf DNA polymerase I coding sequence is about 20 bp from the EcoRV restriction
enzyme recognition site, which is, in turn, about 84 bp from the ATG of the
beta-galactosidase coding sequence.

Oligonucleotide site-directed mutagenesis was then used to alter the carboxy
terminus encoding region of the Taf DNA polymerase I coding sequence in plasmid
pBSM:TafEcoRV. Single-stranded plasmid DNA was prepared by infecting a log
phase culture of DG98 harboring the plasmid with helper phage R408. Single-stranded
DNA was recovered and purified via electroelution. Gapped-duplex DNA was formed
between the single-stranded pBSM:TafEcoRV and the large PvuII fragment of vector
pBSM13+, and then the gapped duplex was annealed with mutagenic oligomer, either
DG233 or DG234. Extension and ligation of the reactions containing mutagenic
oligomers annealed to gapped duplex was performed, and the mixtures were
transformed into DG101. Transformed colonies on nitrocellulose filters were screened
by hybridization with γ-³²P-labeled oligomer DG235. Mini-screen DNA prepared
from positive single colonies was analyzed by restriction analysis to confirm the
presence of a new BamHI site, loss of a BglII site, and the appropriate PvuII pattern.
DNA sequence analysis confirmed the mutagenesis.

The sequences of the mutagenic and probe oligomers are shown below.

DG233 SEQIDNO: 31 S-GOGAATTCGAGCTOGGTACC- 

GGATCXnX^TTXXX^CTXJ'niT^ 
DG234 SEQIDNO: 32 S^CCTITACCXX^ 

TOXACTC1T1TCC 

20 DG235 SEQIDNO: 33 5-GATCCTCATTXXX}ACTC 

Mutagenesis with DG233 changed the TAA stop codon to TGA, created a
BamHI restriction site immediately following the new TGA stop codon, and deleted Taf
and vector sequences to the KpnI site in the polylinker of the vector, a deletion of 213
bp. One of the correct mutants from the DG233 mutagenesis was designated pTaf01.

Mutagenesis with DG234 changed the TAA stop codon to TGA and created a
new BamHI site directly downstream of the TGA stop, but deleted no Taf or vector
sequences downstream of the BamHI site. One of the correct mutants from the DG234
mutagenesis was designated pBSM:TafRV3' and can be used to construct expression
vectors as illustrated with pTaf01, below.

Oligonucleotide site-directed mutagenesis was used to alter the 5'-end of the Taf
DNA polymerase I gene in pTaf01. Mutagenesis was as described above, using
mutagenic oligonucleotide DG248 to insert an NcoI restriction site at the ATG start of
the Taf DNA polymerase I coding sequence and to delete vector and Taf sequences to
make the lacZ ATG start codon the start codon for the Taf DNA polymerase I coding
sequence. Transformed colonies on nitrocellulose filters were screened by
hybridization with γ-³²P-labeled oligonucleotide DG237. The sequences of the
mutagenic and probe oligomers are shown below.

DG248  SEQ ID NO: 34  5'-CAAATAGAAAGATCTTTCGC
                      ATGGCTOrrKXJroTGTCAAATrG
DG237  SEQ ID NO: 35  5'-GAAACAGCCATGGGAAAG

Mini-screen DNA prepared from positive colonies was subjected to restriction
analysis to confirm the presence of the new NcoI site and the deletion. DNA sequence
analysis was also performed to ensure that the correct sequence was obtained. The
correct plasmid was designated pTaf02. IPTG-induced cultures harboring pTaf02
expressed heat-stable polymerase activity at 24 units per mg crude extract protein
(where a pBSM13+ control culture was 0.04 units per mg crude extract protein, and the
pBSM:TafEcoRV culture was 6.5 units per mg crude extract protein).
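The relative improvement can be worked out directly from the three specific activities quoted above. The snippet is only an arithmetic illustration; the dictionary keys and function name are hypothetical labels.

```python
# Specific activities in units per mg crude extract protein, as quoted above.
ACTIVITY_UNITS_PER_MG = {
    "pBSM13+ (control)": 0.04,
    "pBSM:TafEcoRV": 6.5,
    "pTaf02": 24.0,
}


def fold_over(reference, sample, table=ACTIVITY_UNITS_PER_MG):
    """Return how many times higher the sample activity is than the reference."""
    return table[sample] / table[reference]


fold_vs_control = fold_over("pBSM13+ (control)", "pTaf02")
fold_vs_parent = fold_over("pBSM:TafEcoRV", "pTaf02")
print(round(fold_vs_control), round(fold_vs_parent, 1))
```

On these figures pTaf02 gives roughly 600 times the control background and a little under 4 times the activity of the parent pBSM:TafEcoRV construct.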

The 2.7 kb NcoI-BamHI DNA fragment comprising the Taf DNA polymerase I
coding sequence in pTaf02 was cloned into four PL expression plasmids, pDG182-pDG185,
which had been digested with NcoI and BamHI. Plasmids pDG182 and
pDG184 are derivatives of pDG160, and pDG183 and pDG185 are derivatives of
pDG161. The construction of plasmids pDG160 and pDG161 is described in Example
6 of Serial No. 455,967, filed December 22, 1989, the entire disclosure of which is
incorporated herein by reference. The preferred host for such expression vectors is
E. coli K12 strain DG116, and culture of the host cells and induction of expression is
carried out as described in Example 7 of Serial No. 455,967.

To construct expression vectors pDG182-pDG185, plasmids pDG160 and
pDG161 were digested with restriction enzymes MroI and KpnI, and the smaller of the
resulting two fragments was replaced with a duplex adaptor linker, either FL42/FL43
or FL44/FL45, and the vector recircularized by ligation. The sequences of the duplex
adaptor linkers FL42 (SEQ ID NO: 36), FL43 (SEQ ID NO: 37), FL44 (SEQ ID
NO: 38), and FL45 (SEQ ID NO: 39) are shown below.




FL42/FL43

5'-CCGGAAGAAGGAGAAAATACCATGGGCCCGGTAC-3'
3'-    TTCTTCCTCTTTTATGGTACCCGGGC-5'

FL44/FL45

5'-CCGGAGGAGAAAATCCATGGGCCCGGTAC-3'
3'-    TCCTCTTTTAGGTACCCGGGC-5'

The following table describes the properties of plasmids pDG182-pDG185.

                                              Oligonucleotide Duplex
Vector    AmpR or TetR    RBS    Site at ATG    Cloned into pDG160 or pDG161
pDG182    Amp             T7     NcoI           FL42/FL43-pDG160
pDG184    Amp             N      NcoI           FL44/FL45-pDG160
pDG183    Tet             T7     NcoI           FL42/FL43-pDG161
pDG185    Tet             N      NcoI           FL44/FL45-pDG161



In addition to the features tabulated above, the pDG182-pDG185 vectors also contain
the δ-toxin positive retroregulator from Bacillus thuringiensis and point mutations in
the RNA II gene which render the plasmids temperature sensitive for copy number.
Derivatives of pDG182-pDG185 containing the 2.7 kb NcoI to BamHI
fragment are pTaf03 (from pDG182), pTaf04 (from pDG183), pTaf05 (from
pDG184), and pTaf06 (from pDG185). These plasmids produce Taf DNA polymerase
I activity when expression is induced.

Example 5

PCR with Taf DNA Polymerase
About 1.25 units of the Taf DNA polymerase purified in Example 1 is used to
amplify sequences from Tth genomic DNA. The reaction volume is 50 µl, and the
25 reaction mixture contains 50 pmol of primer DG73, 105 to 10$ copies of the Tth 

genome (~2 x 10⁵ copies of genome/ng DNA), 50 pmol of primer DG74, 200 µM of
each dNTP, 2 mM MgCl2, 10 mM Tris-HCl, pH 8.3, 50 mM KCl, and 100 µg/ml
gelatin (optionally, gelatin may be omitted).
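The amounts in this recipe can be converted into working concentrations with simple unit arithmetic. The sketch below is an illustration only; the helper name is hypothetical and the numbers are taken from the reaction described above (50 pmol of each primer and 1.25 units of enzyme in 50 µl).

```python
def micromolar(pmol, volume_ul):
    """Convert an amount in pmol and a volume in microliters to µM.

    1 pmol per µl is 1e-12 mol per 1e-6 l, i.e. 1e-6 mol/l, so the
    plain ratio is already expressed in µM.
    """
    return pmol / volume_ul


REACTION_VOLUME_UL = 50

primer_um = micromolar(50, REACTION_VOLUME_UL)   # each of DG73 and DG74
enzyme_units_per_ul = 1.25 / REACTION_VOLUME_UL  # Taf polymerase
total_dntp_um = 4 * 200                          # four dNTPs at 200 µM each

print(primer_um, enzyme_units_per_ul, total_dntp_um)
```

Each primer is therefore present at 1 µM, and the four dNTPs together contribute 800 µM nucleotide against 2 mM MgCl2.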

The reaction is carried out on a Perkin-Elmer Cetus Instruments DNA Thermal
Cycler. Twenty to thirty cycles of 96°C for 15 seconds, 50°C for 30 seconds, and 75°C
for 30 seconds are carried out. At 20 cycles, the amplification product (160 bp in size)
can be faintly seen on an ethidium bromide stained gel, and at 30 cycles, the product is
readily visible (under UV light) on the ethidium bromide stained gel.

The PCR may yield fewer non-specific products if fewer units of Taf DNA
polymerase are used (i.e., 0.31 units/50 µl reaction). Furthermore, the addition of a
non-ionic detergent, such as laureth-12, to the reaction mixture to a final concentration of
1% can improve the yield of PCR product.
Primers DG73 and DG74 are shown below:

DG73 SEQIDNO: 40 5 1 -TAOGTKXXXK3GCCTTGTAC 
DG74 SEQIDNO: 41 5 • -AGGAGGTGATCCAACCXjCA 

Example 6

Expression of modified Taf polymerase

In an effort to increase the expression levels, site specific mutagenesis was
performed to (1) remove the predicted hairpin structure from codons 2 to 6 of the
coding sequence; and (2) change codons 2, 5, 6, 7, 9, and 11 to codons used more
commonly in E. coli than the codons present in the native sequence. Mutagenic
primers, FR404 or FR405, each containing a modified sequence, were synthesized and
phosphorylated. The modified codon 9 and codon 10 form a KpnI site.
Single-stranded pTaf02 was prepared by coinfecting a log phase culture of DG101 containing
the plasmid with the helper phage R408, commercially available from Stratagene. A
"gapped duplex" of single stranded pTaf02 and the large fragment from the PvuII
digestion of pBSM13+ was created by mixing the two plasmids, heating to boiling for
2 minutes, and cooling to 65°C for 5 minutes. Mutagenic primer FR404 or FR405 was
then annealed with the "gapped duplex" by mixing, heating to 80°C for 2 minutes, and
then cooling slowly to room temperature. The remaining gaps were filled by extension
with Klenow and the fragments ligated with T4 DNA ligase, both reactions taking place
in 200 nM of each dNTP and 40 nM ATP in standard salts at 37°C for 30 minutes.

The resulting circular fragment was transformed into DG101 host cells by plate
transformations on nitrocellulose filters. Duplicate filters were made and the presence
of the correct plasmid was detected by probing with a ³²P-phosphorylated probe;
FR401 was used to screen for the product of mutagenesis with FR404 and FR399 was
used to screen for the product of mutagenesis with FR405. Mini-screen DNA prepared
from positive colonies was subjected to restriction analysis to confirm the presence of
the NcoI site from pTaf02 and the introduced KpnI site, after which the DNA sequence
was confirmed. Two expression vectors were produced by the above protocol; the
vector created using FR404 was designated pTaf07 and the vector created using FR405
was designated pTaf08.

The oligonucleotide sequences used in this example are listed below.

Oligo    SEQ ID NO:       Sequence
FR399    SEQ ID NO: 42    5'-TAAGATGTTCTTGTTC
FR401    SEQ ID NO: 43    5'-TAAGATGTTCCTGTTC
FR404    SEQ ID NO: 44    5'-ATACTAAACCGGTACCAT-
                          CX3AACAGGAACATCTTACXXIATGGC
FR405    SEQ ID NO: 45    5'-ATACTAAACGGGTAOCA-
                          TCX3AACAAGAACATCITACCCATGGC



Example 7
Expression of Truncated Taf Polymerase

Mutein forms of the Taf polymerase lacking 5' -> 3' exonuclease activity were
constructed by introducing deletions in the 5' end of the gene. Both 279 and 417 base
pair deletions were created using the following protocol: an expression plasmid was
digested with restriction enzymes to excise the desired fragment, the fragment ends
were repaired with Klenow and all four dNTPs to produce blunt ends, and the
products were ligated to produce a new circular plasmid with the desired deletion. To
express a 93 kilodalton, 5' -> 3' exonuclease-deficient form of Taf polymerase, a 279
bp deletion comprising amino acids 2-93 was generated. To express an 88 kilodalton,
5' -> 3' exonuclease-deficient form of Taf polymerase, a 417 bp deletion comprising
amino acids 2-139 was generated.
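For a 5' deletion to leave the remaining coding sequence readable, the number of base pairs removed has to be a whole number of codons. The check below is a minimal sketch of that reasoning using the two deletion sizes given above; the function name is hypothetical.

```python
def codons_removed(deletion_bp):
    """Return the number of whole codons in a deletion.

    Raises ValueError when the deletion is not a multiple of three,
    because such a deletion would shift the downstream reading frame.
    """
    whole, remainder = divmod(deletion_bp, 3)
    if remainder:
        raise ValueError(f"{deletion_bp} bp deletion is out of frame")
    return whole


deletions = {bp: codons_removed(bp) for bp in (279, 417)}
print(deletions)
```

Both deletions are multiples of three (93 and 139 codons), so the sequence downstream of each deletion stays in the original reading frame.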

To create a plasmid with codons 2-93 deleted, pTaf03 was digested with NcoI
and NdeI and the ends were repaired by Klenow treatment. The digested and repaired
plasmid was diluted to 5 µg/ml and ligated under blunt end conditions. The dilute
plasmid concentration favors intramolecular ligations. The ligated plasmid was
transformed into DG116. Mini-screen DNA preparations were subjected to restriction
analysis and correct plasmids were confirmed by DNA sequence analysis. The
resulting expression vector created by deleting a segment from pTaf03 was designated
pTaf09. A similar vector created from pTaf05 was designated pTaf10.

Expression vectors also were created with codons 2-139 deleted. The same
protocol was used with the exception that the initial restriction digestion was performed
with NcoI and BglII. The expression vector created from pTaf03 was designated
pTaf11 and the expression vector created from pTaf05 was designated pTaf12.

Example 8
Expression Vectors With T7 Promoters
Expression efficiency can be altered by changing the promoter and/or ribosomal
binding site (RBS) in an expression vector. The T7 gene 10 promoter and RBS were
used to control the expression of Taf DNA polymerase in expression vector pTaf13,
and the T7 gene 10 promoter and the gene N RBS were used to control the expression
of Taf DNA polymerase in expression vector pTaf14. The construction of these
vectors took advantage of unique restriction sites present in pTaf05: an AflII site
upstream of the promoter, an NcoI site downstream of the RBS, and a BspEI site
between the promoter and the RBS. The existing promoter was excised from pTaf05
and replaced with a synthetic T7 gene 10 promoter using techniques similar to those
described in the previous examples.

The synthetic insert was created from two overlapping synthetic
oligonucleotides. To create pTaf13, equal portions of FR414 and FR416 were mixed,
heated to boiling, and cooled slowly to room temperature. The hybridized
oligonucleotides were extended with Klenow to create a full-length double-stranded
insert. The extended fragment was then digested with AflII and NcoI, leaving the
appropriate sticky ends. The insert was cloned into plasmid pTaf05 digested with AflII
and NcoI. DG116 host cells were transformed with the resulting plasmid and
transformants screened for the desired plasmid.

The same procedure was used in the creation of pTaf14, except that FR414 and
FR418 were used, and the extended fragment was digested with AflII and BspEI. This
DNA fragment was substituted for the PL promoter in plasmid pTaf05 that had been
digested with AflII and BspEI.

Plasmids pTaf13 and pTaf14 are used to transform E. coli host cells that have
been modified to contain an inducible T7 RNA polymerase gene. However, because
T7 RNA polymerase may not recognize the δ-toxin retroregulator terminator sequence
present in the plasmid vector, it may be desirable to clone the T7 gene 10 terminator
sequence into pTaf13 or pTaf14.

The T7 gene 10 terminator sequence was first cloned into a small, high copy
number E. coli cloning vector, pUC19, available as ATCC 37254 (see Yanisch-Perron
et al., 1985, Gene 33:103-119). Synthetic oligonucleotides, HW73 and HW75, were
annealed to provide the T7 gene 10 terminator sequence flanked by HindIII sticky ends.
The pUC19 plasmid was digested with HindIII and ligated with the HW73/HW75
duplex. The resulting plasmid, designated pTW66, was transformed into DG101 and
screened for orientation by restriction enzyme digestion and DNA sequence analyses.

A second vector was created from pUC19 by inserting the T7 promoter
sequence. Synthetic oligonucleotides, HW71 and HW72, were annealed to provide the
T7 promoter sequence flanked by BamHI sticky ends. The pUC19 plasmid was
digested with BamHI and ligated with the HW71/HW72 duplex. The resulting
plasmid, designated pTW64, was transformed into DG101 and screened for orientation
by restriction analyses and sequence analysis.

A 95 bp fragment containing the T7 promoter was isolated from pTW64 by
digestion with EcoRI and HindIII and separation of the restriction fragments by gel
electrophoresis. The pTW66 plasmid was digested with EcoRI and HindIII and ligated
with the purified fragment from the digestion of pTW64. The resulting vector,
designated pTW67, contains both the T7 promoter sequence and gene 10 terminator
sequence.

The T7 gene 10 terminator sequence is excised from the pTW67 vector by
digestion with XhoI and SalI. The vector is also cut with PvuII to reduce background.
The pTaf13 vector is cut with SalI, which cleaves at a unique site just downstream of the
existing terminator. Digestions with XhoI and SalI leave the same sticky end for
ligation. The fragment containing the T7 gene 10 terminator sequence is ligated with
the cleaved pTaf13. The resulting plasmid, designated pTaf16, is transformed into
MM294 and screened for orientation.
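The statement that XhoI and SalI leave the same sticky end can be illustrated from their recognition sequences. The sketch below assumes the standard specificities C^TCGAG for XhoI and G^TCGAC for SalI; the data structure and function names are hypothetical.

```python
# Recognition sequence and top-strand cut offset (bases left of the cut).
# XhoI cuts C^TCGAG and SalI cuts G^TCGAC; both sites are palindromic.
ENZYMES = {
    "XhoI": ("CTCGAG", 1),
    "SalI": ("GTCGAC", 1),
}


def five_prime_overhang(site, cut):
    """Return the 5' single-stranded overhang left by a palindromic cutter.

    The bottom strand is cut at the mirror-image position, so the
    overhang spans site[cut : len(site) - cut] read on the top strand.
    """
    return site[cut:len(site) - cut]


def hybrid_junction(left_enzyme, right_enzyme):
    """Return the sequence formed when a left end cut by one enzyme is
    ligated to a right end cut by the other."""
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:len(left_site) - left_cut] + right_site[len(right_site) - right_cut:]


overhangs = {name: five_prime_overhang(*spec) for name, spec in ENZYMES.items()}
junction = hybrid_junction("XhoI", "SalI")
print(overhangs, junction)
```

Both enzymes leave a TCGA overhang, so the ends ligate to each other. The resulting hybrid junction matches neither recognition sequence, so an XhoI end joined to a SalI end is not recut by either enzyme.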
The expression plasmid pTaf16, which contains the T7 gene 10 promoter, RBS,
and the T7 gene 10 terminator, is transformed into an E. coli host cell modified to
contain an inducible T7 RNA polymerase gene.

The oligonucleotides used in the construction of these vectors are listed below.
FR414 SEQ ID NO: 46 5 t -TCAGCTTAAGACITCGAAATTAATA- 
25 CGACTCACTATAGGGAGACCACAA- 

CXXjTTTCCCTC 
FR416 SEQ ID NO: 47 5-TX^ACCATGGGTATATCTCCTT- 

CTrAAAGTTAAACAAAATrATTTC- 
TAGAGGGAAACCGTTG 
30 FR418 SEQ ID NO: 48 5'-TCAGTCCGGATAAACAAAA- 

TTATTIUTAGAGGGAAACCXnTG 
HW71 SEQ ID NO: 49 S^ATCACTTOjAAATTAA- 

TACX5ACTCACTATAGGGAGACXX} 
HW72 SEQ ID NO: 50 5-GATC(X5GTCTCCCTATAGTGAG- 
35 TCXjTATTAATTIXXjAAGT 
HW73 SEQ ID NO: 51 5'-AGCTTTAAAGATCTAATAACTA-

GC^TAACCCCITCKK^ 
CXjGGTCITGAGGGGTTI^^ 
CTCGAG 

5 HW75 SEQIDNO: 52 5-AGCT<nXXiAGTCAGCA/W^ 

TCAAGACtXX3TTTAGAGGCCCCAA- 
GGGGTTATGCrAGTTATTAGATCTTrAA 

Example 9

Translational Coupling

To effect the translation of the Taf polymerase gene, translationally coupled
derivatives of Taf expression vectors were constructed. An expression vector was
constructed with a secondary translation initiation signal and short coding sequence just
upstream of the Taf gene coding sequence such that the stop codon for the short coding
sequence is coupled, i.e., overlaps, with the ATG start codon for the Taf gene coding
sequence. Translation of the short coding sequence brings the ribosome into close
proximity with the Taf gene translation initiation site, thereby enhancing translation of
the Taf gene.

Translationally coupled Taf expression vectors were constructed with the
translation initiation signal and first ten codons of the T7 bacteriophage major capsid
protein (gene 10) fused in-frame to the last six codons of the E. coli TrpE gene placed
upstream of the Taf coding region. The TGA (stop) codon for TrpE is "coupled" with
the ATG (start) codon for the Taf gene, forming the sequence TGATG as it is coupled
with the ATG (start) codon for TrpD on the E. coli chromosome. A one base
frame-shift is required between translation of the short coding sequence and translation of the
Taf coding sequence.
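The geometry of the TGATG junction can be made explicit by locating the stop and start codons and comparing their positions modulo three. This is a minimal sketch of that bookkeeping; the function name is hypothetical, and the sign convention (a start codon one base upstream of the next in-frame position is reported as -1) is an assumption made for the illustration.

```python
def coupling_geometry(junction, stop="TGA", start="ATG"):
    """Return (shared_bases, frame_shift) for a coupled stop/start junction.

    shared_bases is the number of nucleotides belonging to both codons.
    frame_shift is the downstream reading frame relative to the upstream
    one, expressed as -1, 0, or +1.
    """
    stop_at = junction.find(stop)
    if stop_at < 0:
        raise ValueError("stop codon not found in junction")
    start_at = junction.find(start, stop_at)
    if start_at < 0:
        raise ValueError("start codon not found at or after the stop codon")
    shared = max(0, stop_at + len(stop) - start_at)
    offset = (start_at - stop_at) % 3
    frame_shift = offset - 3 if offset == 2 else offset
    return shared, frame_shift


shared_bases, frame_shift = coupling_geometry("TGATG")
print(shared_bases, frame_shift)
```

In TGATG the stop and start codons share a single A, and the start codon sits one base back from the next in-frame position, which corresponds to the one-base frame shift described above.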

In the example below, a fragment containing the T7 gene 10-E. coli TrpE/TrpD
fusion product (the last 6 codons and TGA stop codon from TrpE along with the
overlapping ATG start codon from TrpD) was obtained from a pre-existing plasmid.
One of ordinary skill will recognize that the T7 gene 10-E. coli TrpE/TrpD fusion
product used in the construction of the translationally coupled expression vectors can be
constructed from synthetic oligonucleotides. The sequence for the inserted fragment is
listed below.

The T7 gene 10-E. coli TrpE/TrpD fusion product was amplified using plasmid
pSYC1868 and primers FL48 and FL50. FL52 and FL54 were used to amplify the 5'
end of the Taf Pol I gene in pTaf02 from the ATG start codon to the BglII site
downstream of the ATG start codon. The primers FL50 and FL52 were designed to be
partially complementary. Consequently, the extension product of FL48 can hybridize
to the extension product of FL54. The two amplification products were mixed, heated
to 95°C and slowly cooled to room temperature to anneal. Hybrids formed between the
extension products of FL48 and FL54 were extended with Taq polymerase to form a
full length double-stranded molecule.

The extended insert was amplified with primers FL48 and FL54 and then
digested with MroI and BglII. Plasmid pTaf03 was digested with MroI and BglII, then
treated with calf intestine alkaline phosphatase to prevent re-ligation. The digested
pTaf03 was ligated with the insert. DG116 host cells were transformed with the
resulting construct and transformants screened for the desired plasmid DNA. The
resulting vector was designated pTaf15.

The sequences of the oligonucleotide primers and the T7 gene 10-E. coli
TrpE/TrpD fusion product (gene 10 insert) are listed below.

Primers          SEQ ID NO:       Sequence
FL48             SEQ ID NO: 53    5'-TGCXXjACTTTAAGAAGGAGATATAC
FL50             SEQ ID NO: 54    5'-AACATXJITACCC^TCAGAAAGTCTCCT
FL52             SEQ ID NO: 55    5'-AGACTTTCTGATGGGTAAGATGTTC
FL54             SEQ ID NO: 56    5'-AACAAGTIGTAAAAGATXnTrATtnXXAG
Gene 10 insert   SEQ ID NO: 57    5'-CTTTAAGAAGGAGATATACATATGGCTAG-
                                  (^TGACTCGTGGACAGCAAATGCATC
                                  GGAGACTTTCTGATG

Example 10
Arg U tRNA Expression
The pattern of codon usage differs between Thermosipho africanus and E. coli.
In the Taf coding sequence, arginine is most frequently coded for by the AGA codon,
whereas this codon is used in low frequency in E. coli host cells. The corresponding
Arg U tRNA appears in low concentrations in E. coli. The low concentration in the
host cell of Arg tRNA using the AGA codon may limit the translation efficiency of the
Taf polymerase gene. The efficiency of translation of the Taf coding sequence within
an E. coli host may be improved by increasing the concentration of this tRNA species
by cloning multiple copies of the tRNA gene into the host cell using a second
expression vector that contains the gene for the "Arg U" tRNA.
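The kind of codon-usage tally behind this observation can be sketched as follows. The coding sequence used here is a short made-up string for illustration only, not the Taf sequence, and the function name is hypothetical; the six arginine codons are those of the standard genetic code.

```python
from collections import Counter

# The six arginine codons of the standard genetic code.
ARG_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def arg_codon_usage(cds):
    """Count arginine codons in a coding sequence read in frame from base 0.

    Trailing bases that do not form a complete codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    return Counter(codon for codon in codons if codon in ARG_CODONS)


# Hypothetical toy sequence: ATG AGA AAA AGA CGT AGA TAA
TOY_CDS = "ATGAGAAAAAGACGTAGATAA"
usage = arg_codon_usage(TOY_CDS)
print(usage)
```

Applied to a real coding sequence, a tally dominated by AGA would point to the Arg U tRNA as a candidate for supplementation in the host.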

The Arg U tRNA gene was PCR amplified from E. coli genomic DNA using
the primers DG284 and DG285. The amplification product was digested with SalI and
BamHI. The ColE1 compatible vector pACYC184, commercially available from
New England Biolabs, was digested with SalI and BamHI, and the Arg U gene
fragment was subsequently ligated with the digested vector. DG101 cells were
transformed, and the ligated vector was designated pARG01. Finally, DG116 host
cells were co-transformed with pARG01 and pTaf03.
The oligonucleotide primers used in this Example are listed below.

Primers SEQ ID NO: Sequence
DG284 SEQ ID NO: 58 5M3GGGGATCCAAAAGCCA^ 
DG285 SEQ ID NO: 59 5^GGGGTCGAOGCATGCGAGGAAAATAGACG 

Example 11

Purification of Recombinant Taf Polymerase

Recombinant Taf DNA Polymerase can be purified from the expression
host/vector combinations described, for example, E. coli strain DG116 containing one
of the expression vectors described in Example 4, above, using the following protocol.
The seed flask for a 10 L fermentation contains tryptone (20 g/l), yeast extract
(10 g/l), NaCl (10 g/l), glucose (10 g/l), ampicillin (50 mg/l), and thiamine (10 mg/l).
The seed flask is inoculated with a colony from an agar plate (a frozen glycerol culture
can be used). The seed flask is grown at 30°C to between 0.5 and 2.0 O.D. (A680). The
volume of seed culture inoculated into the fermentor is calculated such that the bacterial
concentration is 0.5 mg dry weight/liter. The 10 liter growth medium contains 25 mM
KH2PO4, 10 mM (NH4)2SO4, 4 mM sodium citrate, 0.4 mM FeCl3, 0.04 mM ZnCl2,
0.03 mM CoCl2, 0.03 mM CuCl2, and 0.03 mM H3BO3. The following sterile
components are added: 4 mM MgSO4, 20 g/l glucose, 20 mg/l thiamine, and 50 mg/l
ampicillin. The pH is adjusted to 6.8 with NaOH and controlled during the fermentation
by added NH4OH. Glucose is continually added by coupling to NH4OH addition.
Foaming is controlled by the addition of propylene glycol as necessary, as an
antifoaming agent. Dissolved oxygen concentration is maintained at 40%.

The fermentor is inoculated as described above, and the culture is grown at
30°C to a cell density of 0.5 to 1.0 X 10¹⁰ cells/ml (optical density [A680] of 15). The
growth temperature is shifted to between 37°C and 41°C to induce the synthesis of Taf
DNA polymerase. The temperature shift increases the copy number of the expression
plasmid and simultaneously derepresses the lambda PL promoter controlling
transcription of the modified Taf DNA polymerase gene through inactivation of the
temperature-sensitive cI repressor encoded by the defective prophage lysogen in the
host.

The cells are grown for 6 hours to an optical density of 37 (A680) and harvested
by centrifugation. The cell mass (ca. 95 g/l) is resuspended in an equivalent volume of
buffer containing 50 mM Tris-Cl, pH 7.6, 20 mM EDTA and 20% (w/v) glycerol. The
suspension is slowly dripped into liquid nitrogen to freeze the suspension as "beads" or
small pellets. The frozen cells are stored at -70°C.

To 200 g of frozen beads (containing 100 g wet weight cell) is added 100 ml of
1X TE (50 mM Tris-Cl, pH 7.5, 10 mM EDTA) and dithiothreitol (DTT) to 0.3 mM,
phenylmethanesulfonyl fluoride (PMSF) to 2.4 mM, leupeptin to 1 µg/ml and
L-1-chloro-3-[4-tosylamido]-7-amino-2-heptanone-HCl (TLCK) (the latter three are
protease inhibitors) to 0.2 mM. The sample is thawed on ice and uniformly
resuspended in a blender at low speed. The cell suspension is lysed in an Aminco
French pressure cell at 20,000 psi. To reduce viscosity, the lysed cell sample is
sonicated 4 times for 3 min. each at 50% duty cycle and 70% output. The sonicate is
adjusted to 550 ml with 1X TE containing 1 mM DTT, 2.4 mM PMSF, 1 µg/ml
leupeptin and 0.2 mM TLCK (Fraction I). After addition of ammonium sulfate to 0.3
M, the crude lysate is rapidly brought to 75°C in a boiling water bath and transferred to
a 75°C water bath for 15 min. to denature and inactivate E. coli host proteins. The
heat-treated sample is chilled rapidly to 0°C and incubated on ice for 20 min.
Precipitated proteins and cell membranes are removed by centrifugation at 20,000 X G
for 30 min. at 5°C and the supernatant (Fraction II) saved.

The heat-treated supernatant (Fraction II) is treated with polyethyleneimine
(PEI) to remove most of the DNA and RNA. Polymin P (34.96 ml of 10% [w/v], pH
7.5) is slowly added to 437 ml of Fraction II at 0°C while stirring rapidly. After 30
min. at 0°C, the sample is centrifuged at 20,000 X G for 30 min. The supernatant
(Fraction III) is applied at 80 ml/hr to a 100 ml phenyl sepharose column (3.2 x 12.5 cm)
that has been equilibrated in 50 mM Tris-Cl, pH 7.5, 0.3 M ammonium sulfate, 10 mM
EDTA, and 1 mM DTT. The column is washed with about 200 ml of the same buffer
(A280 to baseline) and then with 150 ml of 50 mM Tris-Cl, pH 7.5, 100 mM KCl, 10
mM EDTA and 1 mM DTT. The Taf DNA polymerase is then eluted from the column
with buffer containing 50 mM Tris-Cl, pH 7.5, 2 M urea, 20% (w/v) ethylene glycol,
10 mM EDTA, and 1 mM DTT, and fractions containing DNA polymerase activity are
pooled (Fraction IV).
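
The figures in this step can be cross-checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the PEI calculation assumes the two volumes are simply additive, and the column is treated as an ideal cylinder. It shows that 34.96 ml of Polymin P corresponds to 0.08 volume of the 437 ml sample, and that a 3.2 x 12.5 cm bed is consistent with the stated 100 ml column.

```python
import math


def pei_addition(sample_ml, stock_ml, stock_pct):
    """Return (stock-to-sample volume ratio, final % w/v), assuming additive volumes."""
    ratio = stock_ml / sample_ml
    final_pct = stock_ml * stock_pct / (sample_ml + stock_ml)
    return ratio, final_pct


def bed_volume_ml(diameter_cm, height_cm):
    """Volume of an ideal cylindrical column bed in ml (1 cm^3 = 1 ml)."""
    return math.pi * (diameter_cm / 2.0) ** 2 * height_cm


if __name__ == "__main__":
    ratio, final_pct = pei_addition(sample_ml=437.0, stock_ml=34.96, stock_pct=10.0)
    print(f"Polymin P added: {ratio:.2f} volume, final {final_pct:.2f}% (w/v)")
    print(f"Bed volume: {bed_volume_ml(3.2, 12.5):.1f} ml")
```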

Fraction IV is adjusted to a conductivity equivalent to 50 mM KCl in 50 mM
Tris-Cl, pH 7.5, 1 mM EDTA, and 1 mM DTT. The sample is applied (at 9 ml/hr) to a
15 ml heparin-sepharose column that has been equilibrated in the same buffer. The
column is washed with the same buffer at ca. 14 ml/hr (3-5 column volumes) and
eluted with a 150 ml 0.05 to 0.75 M KCl gradient in the same buffer. Fractions
containing the Taf DNA polymerase are pooled, concentrated, and diafiltered against
2.5X storage buffer (50 mM Tris-Cl, pH 8.0, 250 mM KCl, 0.25 mM EDTA, 2.5 mM
DTT, and 0.5% Tween 20), subsequently mixed with 1.5 volumes of sterile 80% (w/v)
glycerol, and stored at -20°C.
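
The dilution and gradient figures in this step can be worked through as follows. This is a minimal sketch under stated assumptions: volumes are taken as additive on mixing, the gradient is taken as perfectly linear, and all function and variable names are hypothetical. Mixing one volume of 2.5X storage buffer with 1.5 volumes of 80% glycerol gives a 1X buffer in about 48% glycerol.

```python
# Composition of the 2.5X storage buffer as stated in the text.
STORAGE_2_5X = {
    "Tris-Cl (mM)": 50.0,
    "KCl (mM)": 250.0,
    "EDTA (mM)": 0.25,
    "DTT (mM)": 2.5,
    "Tween 20 (%)": 0.5,
}


def dilute(conc, own_volumes, other_volumes):
    """Concentration after mixing, assuming the volumes are additive."""
    return conc * own_volumes / (own_volumes + other_volumes)


def final_storage_buffer(stock=STORAGE_2_5X, glycerol_pct=80.0, glycerol_volumes=1.5):
    """Final composition after adding glycerol to one volume of 2.5X buffer."""
    final = {name: dilute(c, 1.0, glycerol_volumes) for name, c in stock.items()}
    final["glycerol (%)"] = dilute(glycerol_pct, glycerol_volumes, 1.0)
    return final


def gradient_conc(volume_ml, start_m=0.05, end_m=0.75, total_ml=150.0):
    """KCl molarity at a given gradient volume, assuming an ideal linear gradient."""
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError("volume outside the gradient")
    return start_m + (end_m - start_m) * volume_ml / total_ml


if __name__ == "__main__":
    for name, value in final_storage_buffer().items():
        print(f"{name}: {value:g}")
    print(f"KCl at gradient midpoint: {gradient_conc(75.0):.2f} M")
```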

Optionally, the heparin sepharose-eluted DNA polymerase or the phenyl
sepharose-eluted DNA polymerase can be dialyzed or adjusted to a conductivity
equivalent to 50 mM KCl in 50 mM Tris-Cl, pH 7.5, 1 mM DTT, 1 mM EDTA, and
0.2% Tween 20 and subjected to nucleotide binding protein affinity chromatography.
The polymerase-containing extract is applied (1 mg protein/ml resin) to an affigel blue
column that has been equilibrated in the same buffer. The column is washed with three
to five column volumes of the same buffer and eluted with a 10 column volume KCl
gradient (0.05 to 0.8 M) in the same buffer. Fractions containing DNA polymerase
activity are pooled, concentrated, diafiltered, and stored as above.
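
Because this step is specified per column volume rather than in absolute volumes, the working volumes follow from the amount of protein loaded. The sketch below assumes a hypothetical protein load (the text does not give one) and a hypothetical helper name; only the 1 mg protein/ml resin loading, the three to five column volume wash, and the 10 column volume 0.05 to 0.8 M gradient come from the text.

```python
def affinity_column_plan(protein_mg, load_mg_per_ml=1.0):
    """Derive resin, wash, and gradient volumes from the protein load."""
    resin_ml = protein_mg / load_mg_per_ml
    gradient_ml = 10.0 * resin_ml
    return {
        "resin_ml": resin_ml,
        "wash_ml": (3.0 * resin_ml, 5.0 * resin_ml),  # three to five column volumes
        "gradient_ml": gradient_ml,
        # Slope of an ideal linear 0.05 to 0.8 M KCl gradient.
        "gradient_slope_M_per_ml": (0.8 - 0.05) / gradient_ml,
    }


if __name__ == "__main__":
    # 20 mg is an arbitrary example load, not a value from the text.
    for key, value in affinity_column_plan(protein_mg=20.0).items():
        print(key, value)
```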

Optionally, the pooled fractions can be subjected to cation exchange
chromatography. The fractions are applied to a 2 ml CM-Tris-Acryl M (LKB) column
equilibrated with a buffer consisting of 25 mM sodium acetate, 20 mM NaCl, 0.1 mM
EDTA, 1 mM DTT, and 0.2% Tween 20 at pH 5.0. The column is washed with 4-5
column volumes of the same buffer and the enzyme eluted with a linear gradient from
20 to 400 mM NaCl in sodium acetate buffer. Active fractions are pooled,
concentrated, diafiltered, and stored as above.

Deposits

The following deposit was made on the date given: 

Strain    Deposit Date    ATCC No.

pTaf02 

This deposit was made under the provisions of the Budapest Treaty on the
International Recognition of the Deposit of Microorganisms for the Purposes of Patent
Procedure and the Regulations thereunder (Budapest Treaty). This assures
maintenance of a viable culture for 30 years from date of deposit. The organism will be
made available by ATCC under the terms of the Budapest Treaty, and subject to an
agreement between Applicants and ATCC, which assures permanent and unrestricted
availability of the progeny of the cultures to the public upon issuance of the pertinent
U.S. patent or upon laying open to the public of any U.S. or foreign patent application,
whichever comes first, and assures availability of the progeny to one determined by the
U.S. Commissioner of Patents and Trademarks to be entitled thereto according to 35
U.S.C. §122 and the Commissioner's rules pursuant thereto (including 37 C.F.R.
§1.14 with particular reference to 886 OG 638). The assignee of the present
application agrees that if the culture on deposit should die or be lost or destroyed when
cultivated under suitable conditions, it will be promptly replaced on notification with a
viable specimen of the same culture. Availability of the deposited strain is not to be
construed as a license to practice the invention in contravention of the rights granted
under the authority of any government in accordance with its patent laws.

1. A purified thermostable DNA polymerase I enzyme that catalyzes
combination of nucleoside triphosphates to form a nucleic acid strand complementary to
a nucleic acid template strand, said enzyme derived from the eubacterium Thermosipho
africanus.



2. The enzyme of Claim 1 that has reverse transcriptase activity.

3. The enzyme of Claim 1 that has 5'→3' exonuclease activity.

4. The enzyme of Claim 1 that has 3'→5' exonuclease activity.

5. A method for purifying Thermosipho africanus DNA polymerase I, said
method comprising the steps of:

(a) preparing a crude cell extract from cells that produce said polymerase;

(b) adjusting the ionic strength of said extract so that said polymerase
dissociates from any nucleic acid in said extract; and

(c) subjecting the extract to at least one step selected from the group consisting
of: hydrophobic interaction, DNA binding protein affinity, nucleotide binding protein
affinity, anion exchange, cation exchange, and hydroxyapatite chromatography step.

6. A recombinant DNA consisting essentially of a nucleotide sequence that
encodes Taf DNA polymerase I activity.

7. The DNA of Claim 6 that encodes the amino acid sequence from amino to
carboxy terminus:

MetGlyLysMetPheLeu
PheAspGlyThrGlyLeuValTyrArgAlaPheTyrAlaIleAsp
GlnSerLeuGlnThrSerSerGlyLeuHisThrAsnAlaValTyr
GlyLeuThrLysMetLeuIleLysPheLeuLysGluHisIleSer
IleGlyLysAspAlaCysValPheValLeuAspSerLysGlyGly
SerLysLysArgLysAspIleLeuGluThrTyrLysAlaAsnArg
ProSerThrProAspLeuLeuLeuGluGlnIleProTyrValGlu
GluLeuValAspAlaLeuGlyIleLysValLeuLysIleGluGly
PheGluAlaAspAspIleIleAlaThrLeuSerLysLysPheGlu
SerAspPheGluLysValAsnIleIleThrGlyAspLysAspLeu
LeuGlnLeuValSerAspLysValPheValTrpArgValGluArg
GlyIleThrAspLeuValLeuTyrAspArgAsnLysValIleGlu
LysTyrGlyIleTyrProGluGlnPheLysAspTyrLeuSerLeu
ValGlyAspGlnIleAspAsnIleProGlyValLysGlyIleGly
LysLysThrAlaValSerLeuLeuLysLysTyrAsnSerLeuGlu
AsnValLeuLysAsnIleAsnLeuLeuThrGluLysLeuArgArg
LeuLeuGluAspSerLysGluAspLeuGlnLysSerIleGluLeu
ValGluLeuIleTyrAspValProMetAspValGluLysAspGlu
IleIleTyrArgGlyTyrAsnProAspLysLeuLeuLysValLeu
LysLysTyrGluPheSerSerIleIleLysGluLeuAsnLeuGln
GluLysLeuGluLysGluTyrIleLeuValAspAsnGluAspLys
LeuLysLysLeuAlaGluGluIleGluLysTyrLysThrPheSer
IleAspThrGluThrThrSerLeuAspProPheGluAlaLysLeu
ValGlyIleSerIleSerThrMetGluGlyLysAlaTyrTyrIle
ProValSerHisPheGlyAlaLysAsnIleSerLysSerLeuIle
AspLysPheLeuLysGlnIleLeuGlnGluLysAspTyrAsnIle
ValGlyGlnAsnLeuLysPheAspTyrGluIlePheLysSerMet
GlyPheSerProAsnValProHisPheAspThrMetIleAlaAla
TyrLeuLeuAsnProAspGluLysArgPheAsnLeuGluGluLeu
SerLeuLysTyrLeuGlyTyrLysMetIleSerPheAspGluLeu
ValAsnGluAsnValProLeuPheGlyAsnAspPheSerTyrVal
ProLeuGluArgAlaValGluTyrSerCysGluAspAlaAspVal
ThrTyrArgIlePheArgLysLeuGlyArgLysIleTyrGluAsn
GluMetGluLysLeuPheTyrGluIleGluMetProLeuIleAsp
ValLeuSerGluMetGluLeuAsnGlyValTyrPheAspGluGlu
TyrLeuLysGluLeuSerLysLysTyrGlnGluLysMetAspGly
IleLysGluLysValPheGluIleAlaGlyGluThrPheAsnLeu
AsnSerSerThrGlnValAlaTyrIleLeuPheGluLysLeuAsn
IleAlaProTyrLysLysThrAlaThrGlyLysPheSerThrAsn
AlaGluValLeuGluGluLeuSerLysGluHisGluIleAlaLys
LeuLeuLeuGluTyrArgLysTyrGlnLysLeuLysSerThrTyr
IleAspSerIleProLeuSerIleAsnArgLysThrAsnArgVal
HisThrThrPheHisGlnThrGlyThrSerThrGlyArgLeuSer
SerSerAsnProAsnLeuGlnAsnLeuProThrArgSerGluGlu
GlyLysGluIleArgLysAlaValArgProGlnArgGlnAspTrp
TrpIleLeuGlyAlaAspTyrSerGlnIleGluLeuArgValLeu
AlaHisValSerLysAspGluAsnLeuLeuLysAlaPheLysGlu
AspLeuAspIleHisThrIleThrAlaAlaLysIlePheGlyVal
SerGluMetPheValSerGluGlnMetArgArgValGlyLysMet
ValAsnPheAlaIleIleTyrGlyValSerProTyrGlyLeuSer
LysArgIleGlyLeuSerValSerGluThrLysLysIleIleAsp
AsnTyrPheArgTyrTyrLysGlyValPheGluTyrLeuLysArg
MetLysAspGluAlaArgLysLysGlyTyrValThrThrLeuPhe
GlyArgArgArgTyrIleProGlnLeuArgSerLysAsnGlyAsn
ArgValGlnGluGlyGluArgIleAlaValAsnThrProIleGln
GlyThrAlaAlaAspIleIleLysIleAlaMetIleAsnIleHis
AsnArgLeuLysLysGluAsnLeuArgSerLysMetIleLeuGln
ValHisAspGluLeuValPheGluValProAspAsnGluLeuGlu
IleValLysAspLeuValArgAspGluMetGluAsnAlaValLys
LeuAspValProLeuLysValAspValTyrTyrGlyLysGluTrp
Glu, which is SEQ ID NO: 2.

8. The DNA of Claim 7 that is SEQ ID NO: 3:

5'-ATGGGAAAGATGTTTCTA
TTTGATGGAACTGGATTAGTATACAGAGCATTTTATGCTATAGAT
CAATCTCTTCAAACTTCGTCTGGTTTACACACTAATGCTGTATAC 
GGACTTACTAAAATGCTTATAAAATTTTTAAAAGAACATATCAGT 
ATTGGAAAAGATGCTTGTGTTTTTGTTTTAGATTCAAAAGGTGGT
AGCAAAAAAAGAAAGGATATTCTTGAAACATATAAAGCAAATAGG 
CCATCAACGCCTGATTTACTTTTAGAGCAAATTCCATATGTAGAA 
GAACTTGTTGATGCTCTTGGAATAAAAGTTTTAAAAATAGAAGGC 
TTTGAAGCTGATGACATTATTGCTACGCTTTCTAAAAAATTTGAA
AGTGATTTTGAAAAGGTAAACATAATAACTGGAGATAAAGATCTT 
TTACAACTTGTTTCTGATAAGGTTTTTGTTTGGAGAGTAGAAAGA 
GGAATAACAGATTTGGTATTGTACGATAGAAATAAAGTGATTGAA 
AAATATGGAATCTACCCAGAACAATTCAAAGATTATTTATCTCTT
GTCGGTGATCAGATTGATAATATCCCAGGAGTTAAAGGAATAGGA 
AAGAAAACAGCTGTTTCGCTTTTGAAAAAATATAATAGCTTGGAA 
AATGTATTAAAAAATATTAACCTTTTGACGGAAAAATTAAGAAGG 
CTTTTGGAAGATTCAAAGGAAGATTTGCAAAAAAGTATAGAACTT
GTGGAGTTGATATATGATGTACCAATGGATGTGGAAAAAGATGAA 
ATAATTTATAGAGGGTATAATCCAGATAAGCTTTTAAAGGTATTA 
AAAAAGTACGAATTTTCATCTATAATTAAGGAGTTAAATTTACAA
GAAAAATTAGAAAAGGAATATATACTGGTAGATAATGAAGATAAA

TTGAAAAAACTTGCAGAAGAGATAGAAAAATACAAAACTTTTTCA 

ATTGATACGGAAACAACTTCACTTGATCCATTTGAAGCTAAACTG 

GTTGGGATCTCTATTTCCACAATGGAAGGGAAGGCGTATTATATT 

CCGGTGTCTCATTTTGGAGCTAAGAATATTTCCAAAAGTTTAATA 

GATAAATTTCTAAAACAAATTTTGCAAGAGAAGGATTATAATATC 

GTTGGTCAGAATTTAAAATTTGACTATGAGATTTTTAAAAGCATG 

GGTTTTTCTCCAAATGTTCCGCATTTTGATACGATGATTGCAGCC 

TATCTTTTAAATCCAGATGAAAAACGTTTTAATCTTGAAGAGCTA 

TCCTTAAAATATTTAGGTTATAAAATGATCTCGTTTGATGAATTA 

GTAAATGAAAATGTACCATTGTTTGGAAATGACTTTTCGTATGTT 

CCACTAGAAAGAGCCGTTGAGTATTCCTGTGAAGATGCCGATGTG 

ACATACAGAATATTTAGAAAGCTTGGTAGGAAGATATATGAAAAT 

GAGATGGAAAAGTTGTTTTACGAAATTGAGATGCCCTTAATTGAT 

GTTCTTTCAGAAATGGAACTAAATGGAGTGTATTTTGATGAGGAA 

TATTTAAAAGAATTATCAAAAAAATATCAAGAAAAAATGGATGGA 

ATTAAGGAAAAAGTTTTTGAGATAGCTGGTGAAACTTTCAATTTA 

AACTCTTCAACTCAAGTAGCATATATACTATTTGAAAAATTAAAT 

ATTGCTCCTTACAAAAAAACAGCGACTGGTAAGTTTTCAACTAAT 

GCGGAAGTTTTAGAAGAACTTTCAAAAGAACATGAAATTGCAAAA 

TTGTTGCTGGAGTATCGAAAGTATCAAAAATTAAAAAGTACATAT 

ATTGATTCAATACCGTTATCTATTAATCGAAAAACAAACAGGGTC 

CATACTACTTTTCATCAAACAGGAACTTCTACTGGAAGATTAAGT 

AGTTCAAATCCAAATTTGCAAAATCTTCCAACAAGAAGCGAAGAA 

GGAAAAGAAATAAGAAAAGCAGTAAGACCTCAAAGACAAGATTGG 

TGGATTTTAGGTGCTGACTATTCTCAGATAGAACTAAGGGTTTTA 

GCGCATGTAAGTAAAGATGAAAATCTACTTAAAGCATTTAAAGAA 

GATTTAGATATTCATACAATTACTGCTGCCAAAATTTTTGGTGTT 

TCAGAGATGTTTGTTAGTGAACAAATGAGAAGAGTTGGAAAGATG 

GTAAATTTTGCAATTATTTATGGAGTTTCACCTTATGGTCTTTCA 

AAGAGAATTGGTCTTAGTGTTTCAGAGACTAAAAAAATAATAGAT 

AACTATTTTAGATACTATAAAGGAGTTTTTGAATATTTAAAAAGG 

ATGAAAGATGAAGCAAGGAAAAAAGGTTATGTTACAACGCTTTTT 

GGAAGGCGCAGATATATTCCACAGTTAAGATCGAAAAATGGTAAT 

AGAGTTCAAGAAGGAGAAAGAATAGCTGTAAACACTCCAATTCAA 

GGAACAGCAGCTGATATAATAAAGATAGCTATGATTAATATTCAT 

AATAGATTGAAGAAGGAAAATCTACGTTCAAAAATGATATTGCAG

GTTCATGACGAGTTAGTTTTTGAAGTGCCCGATAATGAACTGGAG



ATTGTAAAAGATTTAGTAAGAGATGAGATGGAAAATGCAGTTAAG 



CTAGACGTTCCTTTAAAAGTAGATGTTTATTATGGAAAAGAGTGG 



GAATAA 



9. A recombinant DNA vector that comprises the DNA sequence of Claim 6.



10. The recombinant DNA vector of Claim 9 that is a plasmid selected from the
group consisting of pTaf01, pTaf02, pTaf03, pTaf04, pTaf05, pTaf06, pTaf07,
pTaf08, pTaf13, pTaf14, pTaf15, pTaf16, and pBSM:TafEcoRV.

11. A recombinant DNA consisting essentially of a nucleotide sequence that
encodes the amino acid sequence consisting of amino acids 1 and 94 through 892 of
SEQ ID NO: 2.

12. A recombinant DNA consisting essentially of a nucleotide sequence that
encodes the amino acid sequence consisting of amino acids 1 and 140 through 892 of
SEQ ID NO: 2.

13. A recombinant DNA vector that comprises the DNA of Claim 11.



14. A recombinant DNA vector that comprises the DNA of Claim 12.

15. A recombinant DNA vector of Claim 13 that is selected from the group
consisting of pTaf09 and pTaf10.

16. A recombinant DNA vector of Claim 14 that is selected from the group
consisting of pTaf11 and pTaf12.

17. A recombinant host cell transformed with a vector of Claim 9.



18. The recombinant host cell of Claim 17 that is E. coli.



19. The DNA of Claim 7 which has been mutated to cause the resultant
polymerase activity to lack 5'→3' exonuclease activity.

20. The DNA of Claim 19 in which the mutation is a substitution of an Asp
codon for the Gly codon at position 37.

21. The DNA of Claim 19 in which the mutation is a deletion up to and
including the Gly codon at codon 37.
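
The relationship between the nucleotide sequence of Claim 8, the amino acid sequence of Claim 7, the truncations of Claims 11 and 12, and the codon 37 substitution of Claim 20 can be sketched as follows. This is an illustrative sketch only, using the standard genetic code and the first 37 codons of SEQ ID NO: 3 as recited above. The function names are hypothetical, and the choice of GAC as the Asp codon is an assumption, since Claim 20 recites only "an Asp codon".

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "End",
}

# First 37 codons of SEQ ID NO: 3 (spaces added here only for readability).
N_TERMINAL = (
    "ATG GGA AAG ATG TTT CTA "
    "TTT GAT GGA ACT GGA TTA GTA TAC AGA GCA TTT TAT GCT ATA GAT "
    "CAA TCT CTT CAA ACT TCG TCT GGT TTA CAC ACT AAT GCT GTA TAC "
    "GGA"
).replace(" ", "")


def codons(dna):
    """Split a coding sequence into codons."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def translate(dna):
    """Translate a coding sequence into three-letter residue names."""
    return [THREE_LETTER[CODON_TABLE[c]] for c in codons(dna)]


def substitute_codon(dna, position, new_codon):
    """Replace the codon at a 1-based position (as in Claim 20)."""
    parts = codons(dna)
    parts[position - 1] = new_codon
    return "".join(parts)


def truncate(residues, first_kept):
    """Residue 1 followed by residues first_kept to the end (as in Claims 11 and 12)."""
    return residues[:1] + residues[first_kept - 1:]


if __name__ == "__main__":
    print("".join(translate(N_TERMINAL)))
    mutant = substitute_codon(N_TERMINAL, 37, "GAC")  # GAC is an assumed Asp codon
    print(translate(N_TERMINAL)[36], "->", translate(mutant)[36])
```

Applied to the full 892-residue sequence, `truncate(residues, 94)` and `truncate(residues, 140)` would give the residue lists recited in Claims 11 and 12 respectively.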